Episode: 4063
Title: HPR4063: Re: ChatGPT Output is not compatible with CC-BY-SA
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr4063/hpr4063.mp3
Transcribed: 2025-10-25 19:09:54
---
This is Hacker Public Radio, episode 4,063 for Wednesday, the 28th of February, 2024.
Today's show is entitled Re: ChatGPT Output is not compatible with the Creative Commons
Attribution-ShareAlike 4.0 International License.
It is hosted by DNT and is about 27 minutes long.
It carries a clean flag.
The summary is: a response to Hacker Public Radio 3,983, in which Ken argues that shows
using ChatGPT output can't be posted to Hacker Public Radio.
Hello and welcome to another exciting episode of Hacker Public Radio.
I am your host today.
My name is DNT.
And this is a response show to the one that Ken did a while back.
It's been a while now.
I did a little write-up about this after spending some time doing some research and thinking
about that, but then I ended up never recording the show.
But anyway, I picked it up today and here it is.
So I'm just going to read this stuff that I wrote that I think maybe I was going to email
the list or maybe I was always thinking about reading this for a show.
So I'm just going to read it because I don't want to take too much time on that.
So this is about the show where Ken said that content written by ChatGPT, or in other
words ChatGPT output, couldn't be in a Hacker Public Radio show because it's incompatible.
Because the ChatGPT license or terms are incompatible with the Creative Commons
Attribution-ShareAlike license that we use.
So I want to offer a different opinion on this here.
I mean, I shouldn't even call it an opinion.
This is, you know, I'm not a lawyer, of course, and I'm just, I am interested in things about
copyright and free culture in general.
So I spend some time thinking about this and I think there are some interesting things about it
that hackers may be interested in hearing about.
Anyway, on whether ChatGPT output can be licensed under a Creative Commons license,
I think it's true that it can't, but it doesn't mean that ChatGPT output can't be in
a Hacker Public Radio show, I would argue. It just means that that portion of the show
would not be eligible for any kind of protection under the Creative Commons license.
Because it would be unenforceable. So what that means is if I make a show and I license it under
the Creative Commons Attribution-ShareAlike license, and I paste in my ChatGPT output,
someone else could copy the output, not the rest of the show, just the output,
and they would not have to give me attribution. They would be able to ignore those terms
of the Creative Commons license because the ChatGPT output is not eligible for that kind of
protection. And the reason I think that's the case is, it's not that the terms of ChatGPT
are incompatible, it's that ChatGPT itself is incompatible with any kind of copyright
protection, and the Creative Commons license relies on the fact that there is
copyright protection. Through the Creative Commons license, you're explicitly giving up some of
the protections that by default you get from copyright law. So I have an article that discusses
some of this that I'm going to put in the show notes. It's from December 2022. It discusses
decisions that have been made by the courts in the US about ChatGPT and other generative AI.
And then here are a couple of things. So first of all, one of the concepts about copyright
in the US is that what can be copyrighted, what can be eligible for copyright protection,
are expressions of ideas, but not ideas. So if you express an idea, someone else can basically
absorb that idea, learn it from your expression of it, and they can re-express the same idea.
But as long as they did not copy your expression of the idea,
or as long as it's sufficiently different, you would not be able to claim infringement on their
part, right? Against your previous expression of the idea. So if ideas are used but not copied,
then the use would not mean copyright infringement, right? Because you don't own the idea,
you own your expression of the idea. So another important concept about copyright is that,
for it to be copyrightable, a work must be the result of original and creative authorship by a
human author. So that's the first thing. ChatGPT output is not the result of original and creative
authorship by a human author. So right there, it can't be eligible for copyright protection.
It's, there is a famous thing about someone, it was like a drawing that was made by a monkey,
or something like that, and then an artist that had staged this situation for the monkey to draw
on a piece of paper, wanted to claim ownership over the piece. But the, yeah, I think that didn't fly.
I'm maybe a little bit wrong about the details of this, but there have been discussions about this
in terms of, there have been legal decisions about this in cases where animals had produced a piece,
like a physical work that then someone wanted to claim ownership of and wanted to get copyright
protection for. So, but anyway, first going back to the ideas thing. So ideas are not copyrightable,
only expressions of ideas are copyrightable. And, you know, copyrightable, I should really say
eligible for copyright protection, right? So if ChatGPT is going around the internet, gathering ideas,
not copying them, it is not infringing on copyright. It is learning stuff about the probabilities of
what words come after what other words, right? And with that information, it's producing some new
output, right? So, I mean, yeah, there's some strange stuff about what is an idea. Is ChatGPT
really working with an idea here, right? What is actually going on? That I think hasn't been settled,
but there are some things that there have already been some decisions about. So in general,
it's thought that only human expression can get copyright protection. ChatGPT output is not seen as
human expression. That's definite, right? So it cannot get copyright protection. Thus, ChatGPT
actually shouldn't even go so far as to say that they assign ownership of the output to you as they
do in their terms, because they don't have ownership to assign in the first place, right? Instead,
it should say in their terms that no one can own the output because it is not eligible for copyright
protection. So essentially, if it's public, it's in the public domain, right? Any ChatGPT
output, as far as some court decisions have gone. I think those would be the terms that
would actually reflect the understanding by courts in the US, right? So however, within any
CC BY-SA, Creative Commons Attribution-ShareAlike licensed piece, or any other piece, you can use
content that is not eligible for copyright protection because no one can challenge you for using it,
right? That's the essential reason why. It's because it's not eligible for copyright protection,
so no one can tell you you can't do it. So by the same token, if someone reused your ChatGPT output
that they copied from your show, you wouldn't be able to challenge them either on the basis of your
Creative Commons license because it's not eligible for protection. However, other portions of your
show would be eligible for protection, such as your inputs, I mean, your prompts, right, that you
wrote for ChatGPT to generate this output. Those prompts would be eligible for protection,
I would think, and whatever else you put in your show that's not the ChatGPT output.
So I think just because portions of a show aren't eligible for protection on the basis of
a Creative Commons Attribution-ShareAlike license, it doesn't mean the show can't be published
under that license, right? So of course, if you posted a show that was entirely ChatGPT output
and you tried to license it under a Creative Commons license, that license would just be
unenforceable, right? Sure, you could do it, but based on recent court decisions in the US,
that license that you're trying to put on this would not be enforceable.
So, and then furthermore, as for the ChatGPT terms of use, I think the terms of use apply to
using ChatGPT, not to using the output of ChatGPT, right, after you have used ChatGPT.
So since the output is not eligible for copyright protection, no one has any say over how it can
or can't be used. So on whether ChatGPT output
can be licensed under the Creative Commons license again, as Ken discussed in HPR 3983,
I would voice the opinion that the terms of use written by OpenAI don't place any restrictions
on the nature of ChatGPT output use, only on the nature of ChatGPT use itself. So the output is
what you have after you have used ChatGPT, right? So the distinction is that a lot of times people
use ChatGPT to obtain information instead of to obtain content, and actually that's the primary
purpose of ChatGPT. I don't think their purpose in creating it is to generate text
that you're going to then reuse somewhere else, right? The purpose is supposed to be to get
information. And of course, as we know, sometimes you get wrong information.
So the terms, I think, are more concerned with the information than with the content.
And here I'm talking about how you use ChatGPT to learn how to craft a bomb. That's the thing,
that's what I'm talking about here. You use ChatGPT. The terms of use talk about why are you
using ChatGPT? Are you using it to learn how to commit crimes or whatever? That's what the terms
of use are talking about. Not about how are you going to use the output of ChatGPT? No, it's
why are you, how are you using ChatGPT? So let's read this little bit here from the ChatGPT terms,
which Ken provided in the show notes for 3983. "Subject to your compliance with these terms,
OpenAI hereby assigns to you all its right, title and interest in and to the output.
This means you can use the content for any purpose, including commercial purposes such as sale
or publication, if you comply with these terms." So unfortunately, it doesn't explicitly mention
what rights you can choose not to reserve in the output. And I haven't been able to find any
legal opinions on this, but I don't think I have ever seen any cases where you're forbidden
from relinquishing rights over intellectual property. So anyway, the way I see it, this section that I
just read supports the idea that the terms are about using ChatGPT and not
about using the output of ChatGPT. So as long as you observe the terms while using ChatGPT,
the output is yours, is what the terms of use say. So subsequent use or licensing
of the content is not subject to these terms, or it shouldn't be assumed to be. So then the terms
that appear in the usage policies document, which I'll link in the show notes, in my view
support this idea even more, because they're quite clearly about ChatGPT usage as opposed to
usage of ChatGPT outputs. So let's look at section three of the ChatGPT terms, titled Content.
Here's what it says, paragraph B, Similarity of Content: "Due to the nature of machine
learning, output may not be unique across users and the services may generate the same or similar
output for OpenAI or a third party. For example, you may provide input to a model such as
'What color is the sky?' and receive output such as 'The sky is blue.'
Other users may also ask similar questions and receive the same response.
Responses that are requested by and generated for other users are not considered your content."
So this is very similar to the way trademarks that happen to be a word with a common meaning,
such as Apple, are described as "weak trademarks" and are entitled to much less protection.
So it is questionable whether OpenAI even could restrict anyone's use of ChatGPT content.
In fact, I think there have been court rulings in the US, as I said earlier, declaring that
generative AI content is not eligible for protection at all. So perhaps a bigger question is whether
even the rights reserved under a Creative Commons license can actually be reserved by anyone
on ChatGPT content, as I said earlier, meaning that any direct use of ChatGPT content would have
to be put in the public domain. If anything, re-licensing ChatGPT content could come to be seen as
copyright infringement upon those who hold copyright over the training data. But I think the fault
here is with OpenAI in assigning to you the rights over it, rights that OpenAI could not have
over the content themselves. If anything, what I'm trying to say here is I think it's pretty clear
that OpenAI does not have the authority to assign rights over ChatGPT output to anyone.
Maybe we, as a society, our courts, will decide that they actually infringed upon other people's
copyright when they used their copyrighted material in their training data. I think
there is a very strong case there. It may very well happen. Even recently, Sam Altman admitted
that it would be impossible to train ChatGPT without infringing on copyright, and we hope
some very strong opinion will come out of that eventually. In my view, it should be,
no, we are not going to support you breaking our rules so that your company can be very
successful. That's just not how democracy works. So anyway, so now we're coming back to
Hacker Public Radio. Please don't take it to mean that I am a champion of generative AI.
My greatest hope for AI is that it will kill the web a little more. Basically,
companies have been treating people as generative AI and getting them to write
this really crappy copy and copying stuff around back and forth so that now you search for
anything online and all you find is all these ads and stuff. So I think there's some good potential
for all these companies to start using generative AI and for things to get even worse and to maybe
finally lead to the collapse of this web that we have today and then we'll get to rebuild
an internet that we maybe like better, like the ones that many of us here on Hacker Public Radio
mostly live in. So that's right now I think that's a good thing that could come from it, at least
this kind of a mass appeal form that ChatGPT represents. But I am a technical writer at work,
so I do see that there is some potential for some applications in the world of tech companies
that could be really interesting and I am definitely not closing my eyes to it. But as far as
Hacker Public Radio goes, I don't think we should have any considerable amount of
ChatGPT-generated content, of course not. I think it would be wise to have a generative AI policy
anyway. Maybe I would suggest that it could be: we are open to generative AI
content as long as the subject of the show is in some way related to generative AI itself, right?
We're not going to be interested in a show that uses generative AI output
as just a means to talk about something else, right? We would prefer for you to just speak in your
own words, basically, is what I'm saying. But anyway, it's pretty obvious
that we don't have a problem with AI content creeping into Hacker Public Radio. It doesn't seem so,
so I wouldn't worry about it for now anyway. So anyway, that's my view on all that.
I invite you to think about that and see what you think, see if you agree with me,
or if you have another opinion I would love to hear about it, so please record your response
and let's keep the conversation going. This is an important topic of our time, I think, so
it seems like it's a good idea to discuss it. I'll say a couple more things
here. Here's what else I've got written down. Let's see. Yeah, so in a way I think it's kind of
like generative AI hacks copyright a little bit by making it easier
to convert copyrightable content into non-copyrightable ideas, right? Of course it doesn't do that
directly. As we know, it just figures out the probabilities of words coming after
other words, but when we look at it we make ideas out of it, right, by accident, sort of. So
that's kind of why I think it would be a mistake for free culture advocates to oppose
the use of generative AI. So let me read this other bit here. So as with fair use, the eligibility
for copyright protection, this is kind of the basis of the fair use doctrine. The eligibility
for copyright protection hinges on how substantially you added your own authorship to the portion
of something that is not eligible for copyright protection. So then here's a quote from this
article from the Federal Register, a document I mean, about copyright registration guidance.
This is the office of the US government where you register your copyright on
stuff. You know, you can go and register the copyright to kind of strengthen your claim
if there is an infringement later. So they wrote: the Ninth Circuit has held that a book containing
words "authored by non-human spiritual beings" can only qualify for copyright
protection if there is human selection and arrangement of the revelations. I'm not sure what context
this was about here, but the article is specifically about copyright registration guidance
for works containing material generated by artificial intelligence. So in the fair use
doctrine the degree of copyright protection to which such a work would be entitled is based
on the degree of human selection and arrangement of the non-copyrightable source material, is what
that says. So something that is not eligible for
copyright protection, such as ChatGPT output, is only eligible for copyright protection
if there is human selection and arrangement, right? And how strongly it can get protection,
right, is based on the degree of human selection and arrangement of the non-copyrightable source
material. So what that means is, you know, there's always a fair degree of judgment that's made
by the courts whenever they judge copyright infringement to determine whether something would
fall under fair use or not. The fair use doctrine in the US allows for a lot of nuance, you know.
So what this means for ChatGPT is that the janitors would have to make this kind of judgment
case by case on whether or not to reject a show, and then they may very well decide they don't
want to do that, and that's fair enough, and decide that, rather, we're just not going
to accept any shows that have any ChatGPT output at all. That would be completely fine,
of course. So okay, I'll shut up now. Sorry that this was a little bit messy in the end. I kind of
read my notes and kept adding as I went, which just ended up making some things a little bit redundant,
and I mean, I just hope at least some of you understood what I'm trying to say. And again,
let me know what you think. These are just some things that I have come to think about all this
stuff from the perspective of someone who is just interested in how copyright works in our
society and someone who is very much a free culture advocate. So yeah, don't take too much time,
just get a microphone and record your response show and then post it to Hacker Public Radio. Don't
edit it much either. All right, so thank you for listening, and come back tomorrow for another
excellent episode of Hacker Public Radio. Bye.
You have been listening to Hacker Public Radio at hackerpublicradio.org. Today's show was
contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on
our contribute link to find out how easy it really is. Hosting for HPR has been kindly provided
by AnHonestHost.com, the Internet Archive, and rsync.net. Unless otherwise stated, today's show is
released under a Creative Commons Attribution 4.0 International license.