Episode: 3953
Title: HPR3953: Large language models and AI don't have any common sense
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3953/hpr3953.mp3
Transcribed: 2025-10-25 17:51:02
---
This is Hacker Public Radio Episode 3953 for Wednesday, the 27th of September 2023. Today's show is entitled "Large Language Models and AI Don't Have Any Common Sense". It is the first show by new host hobs and is about 18 minutes long. It carries a clean flag. The summary is: learn how to load and run GPT-2 or Llama 2 to test it with common-sense questions.

This is Hobson Lane and Greg Thompson.
We will be trying to load a Hugging Face large language model so that you can do text generation on your own computer, without having to use somebody's proprietary API. Hugging Face has a bunch of models, including chat models and large language models, so you'll need to create a Hugging Face account first; we'll put the link in the show notes. It's huggingface.co/join, and that's where you want to go sign up. Then you'll need to get an access token if you want to use any of the supersized models from Meta or any other company that hides them behind a business source license. They're not really open source, but they are sharing all the weights and all the data; you just can't use them for commercial purposes if you get big enough to compete with them. But anyway, if you need a token, you'll need to get that token from your profile on Hugging Face.
You can put that token in a .env file. That works with a little Python library called dotenv (python-dotenv), which is what you use to load environment variables; if you put your token in a .env file, it will be combined with your existing environment variables when you load it. A quick tip: you definitely want to use load_dotenv, so you say import dotenv and then dotenv.load_dotenv(). But you don't want to then say dotenv.dotenv_values(), because that will give you a dictionary of only your .env variable and value pairs, and you typically want the mapping of all of your environment variables when you're running a server, because there will be things like your Python path and Python version, that kind of stuff that you'll probably need if you're building a real web server.
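As a rough sketch of that tip, assuming python-dotenv is installed and importable as dotenv:

    # Load the .env file and merge its values into the existing environment.
    import dotenv

    dotenv.load_dotenv()       # what you want: combines .env with the process environment
    # dotenv.dotenv_values()   # what you usually don't want: only the .env key/value pairs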
We ran into that problem when we were trying to configure our GitLab CI/CD pipeline, and then we hit it again when we went over to Render to deploy our chatbot software at qary.ai (that's Q-A-R-Y dot A-I).
So once you've got your token loaded, you've got your .env loaded with the dotenv package, but now you've got to import os and build a dict from os.environ. You're going to take os.environ, which is a dict-like object, grab a copy of it, and coerce it into a dictionary: dict, open parenthesis, os.environ, close parenthesis. You should be familiar with that if you've ever worked with environment variables. Now you've got it in a dictionary; we call it env as a variable, and then we can say env, square bracket, quote, and then your Hugging Face access token variable, or whatever you called it in that .env file.
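Putting that together, a minimal sketch, assuming the variable in your .env file is named HUGGINGFACE_ACCESS_TOKEN (the name is just an example; use whatever you actually put in your .env file):

    # Continuing from the dotenv snippet above: pull the token out of the
    # full environment, not just the .env pairs.
    import os
    import dotenv

    dotenv.load_dotenv()
    env = dict(os.environ)                     # coerce the dict-like os.environ into a plain dict
    token = env["HUGGINGFACE_ACCESS_TOKEN"]    # hypothetical name; match your .env file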
Anyway, it turns out we're going to show you how to do this for smaller models. We tried to do it for Llama 2, but that's a four-gigabyte model, and it takes a long time to download, which is really hard to do when you're on a conference call with somebody in Korea, where Greg is located. And when you search for models, it's unfortunately really hard to find what you're looking for on Hugging Face, because there are so many models and people can describe them in a lot of different ways. Don't just hit enter after your search query; instead, go to their full-text search, which will give you more of what you need, or you can click on the link that says something like "see 3,358 model results" for Llama 2, which is what we did in order to find the one we were looking for that could do chat. But like I said, we're going to skip that one and move on to a smaller one, GPT-2. Actually, it's not that much smaller; it's just that I already downloaded it, offline, several days ago. If you've already done this once, the process of downloading and creating a model, if you've gone through the steps we're describing here, then you won't have to do it again and wait for the download, so we're going to use one that I've already done this for.

If you do need that license, you'll need to apply for it from Meta at meta.com if you're trying to use Llama 2; that's under /resources/models-and-libraries/llama-downloads. The show notes will tell you how to do that. But if you just want to use GPT-2, you don't need to do that, because it's two generations back from what OpenAI is building now; they're up to GPT-4 and they're already working on GPT-5.
Let's see, so now, instead of the AutoModel classes that a lot of people use, we're going to use the transformers pipeline object from Hugging Face. The pipeline will include the model and the tokenizer and help you do inference; you won't be able to retrain or fine-tune the model, but at least you can get it to generate some text. So you say from transformers import pipeline, and then you say generator equals pipeline, open parenthesis, "text-generation", and you need to give it the model name with the key model, so you say comma, model equals "openai-gpt". That's openai-gpt, all lowercase, no spaces, just the hyphen in the middle between those two words.
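A minimal sketch of that setup:

    # Build a text-generation pipeline; it bundles the model and its tokenizer
    # and only does inference (no retraining or fine-tuning), but it will generate text.
    from transformers import pipeline

    generator = pipeline("text-generation", model="openai-gpt")
    # (For a gated model like Llama 2 you would also need to authenticate with
    # the access token from earlier.)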
Then you can ask it a question. This is a generative model, so it's only going to try to complete your sentence; it's not going to try to carry on a conversation with you. So if you're trying to ask a question, you probably want to precede it with the prompt "Question:", then ask your question, then probably put a newline after your question, and then "Answer:". That should give it the hint it needs to try to answer your question. Another way you can do it, if you're just asking a math question, is to put an equals sign at the end, and it'll try to complete the equation.
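To make those two prompt shapes concrete, a small sketch (the cow question used later in the episode is reused here as the example):

    # A question/answer scaffold nudges the model to produce an answer;
    # a trailing equals sign invites it to complete an equation instead.
    qa_prompt = "Question: There are 2 cows and 2 bulls, how many legs are there?\nAnswer:"
    math_prompt = "1 + 1 ="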
So we're going to see whether GPT-2 can do any kind of math, because large language models are famous for not being able to do math or common-sense reasoning, which is kind of surprising, since computers can do math quite well and they certainly do logic very well too. But large language models are just trying to predict the next word, so you'll see how this one falls on its face when you ask it to do one plus one. So you put in your question as a string, just the three characters, one, plus, one, and then a fourth character, an equals sign, and put that in quotes; then you can run your generator on that question. You put the equals sign at the end, and it's sort of like a question mark to a machine, or at least to a generative large language model; it's just going to try to complete that formula.
So then you're going to say responses equals generator, open parenthesis. Oh yeah, I've already said generator equals pipeline, so you've already got your generator; you're just going to use that function: generator, open parenthesis, and you give it your string, those four characters you just entered, one plus one equals. Then it will return a bunch of answers. You can set a max length; you want it to be bigger than the number of tokens you input, and because each one of those characters is an individual token (it represents a piece of meaning in that phrase), you're going to have four tokens, so you need to give it at least five on your max length parameter. You're going to say max, underscore, length equals five or six or seven; it will just generate enough tokens to end at the number you give it, and this is for GPT-2 in generative mode. Then for num return sequences you can give it another parameter, if you'd like, for the number of guesses you would like it to take, the number of times you want it to try to generate an answer to that question. We gave it the number 10, just to see if it would have any chance of answering the question. So close your parenthesis after num return sequences. Those keywords have underscores between the words, num_return_sequences and max_length, and they are keyword arguments to the generator function; your question is the positional argument at the beginning. Then you're good to go with your answers equals that, or responses equals that, and you can just print out all those responses if you'd like.
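Putting those arguments together, a minimal sketch of the call described above, using the generator built earlier (do_sample=True is an extra detail not mentioned in the episode; some transformers versions need it to return more than one distinct sequence):

    # Ask for ten completions of "1 + 1 ="; max_length counts prompt tokens
    # plus generated tokens, so it needs to be bigger than the prompt itself.
    responses = generator("1 + 1 =",
                          max_length=6,
                          num_return_sequences=10,
                          do_sample=True)
    for response in responses:
        print(response["generated_text"])   # includes the prompt and the completion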
The responses will include both your question and the answer. In our case, the very first generated text we got was "1 + 1 = 2 2 +", so it keeps going: if you give it a couple of extra tokens, it will keep going. Let's see, one, two, three, four, five; if you give it six tokens, it stops at "2 +", and if you give it more than that, it keeps going and says "1 + 1 = 2 + 5 = 1 +" and so on. It's just trying to complete some sort of equation or system of equations. Third down the list, though, we do see an answer that looks a lot closer: "1 + 1 = 2," and then " = 1" and " = 2", so it does continue on beyond what looks like an answer. Many of the other answers are not even close; there's a one plus one equals six times the speed of sound, and a "1 + 1 = 1,". So out of the 10 answers it got one right, which would be a 10% on its exam, and you can't really even count the one it got right as a right answer, because you'd have to pick and choose among the tokens that are generated, basically making it stop after the first generated token, to get a good answer out of it.
Trying a more complicated question, we used that prompting approach where you say "Question:" and then "Answer:". We put in a question like the ones in the book Natural Language Processing in Action, the question about cows: "There are 2 cows and 2 bulls, how many legs are there?" That was our question, so we put that after the "Question:" prompt, and then we had "Answer:", and then I think we gave it 30 tokens or so as our max length, so that it could answer the question, because there are about 25 tokens in there. If you look really closely and count up all the words and punctuation marks, you can see that, once you include the question and answer prompts, it's going to end up being about 25 tokens. It will give you that estimate as a warning if you set the number too low, saying hey, better give me some more tokens, I can't generate what you need.
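And a sketch of that second experiment, using the same generator as before (again, do_sample=True is my assumption, not something stated in the episode):

    # Question/answer scaffolding plus a roomier max_length for the longer prompt.
    prompt = ("Question: There are 2 cows and 2 bulls, how many legs are there?\n"
              "Answer:")
    responses = generator(prompt, max_length=30, num_return_sequences=10, do_sample=True)
    for response in responses:
        print(response["generated_text"])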
But as for the answers we did come up with for that question about cows: actually, let's see, did I tell it to stop? Well, it looks like the question-and-answer prompting did a better job when I limited the max length. This was when I actually set it to be smaller than the correct amount: once I set it smaller than the actual question needed, it only got one out of the 10 right; actually, it got none of them right, because the answers were things like the word "four", a word like "only", "four" again, the digit "2", the word "one", the digits "30", and so on. It didn't do very well when I underestimated the number of tokens. Then, when I gave it more tokens than it needed, it gave answers like "four", f-o-u-r, period, then a carriage return, and then, quote, "let me see if I have this straight", so it looks like it's going to ask me a question after giving me the answer for the two cows and two bulls. So it doesn't know what legs are, or what I'm talking about; it's counting cows, male and female, because that seems to be what it's counting when it comes up with that answer. The second most likely answer was only "three", and two cows and two bulls add up to more than that. That answer, three, is kind of interesting, because there are a lot of trick questions that people have been asking ChatGPT, and that have been included in the training sets, that are trying to trick it on logic, where you've removed the legs from a couple of the cows or bulls in the question, so some of them only have three legs. That might be why three is showing up so high on the list: it has memorized some text problems that are trying to fool it. But anyway, many of the other answers, really all the other answers, are incorrect: there's a "2,", a "2,", a "1,", a "one per cow", a "30.", a "1.". It's interesting that the number 30 keeps coming up. And the "1." and "3." end in periods, like the end of a sentence, so it thinks it's giving me the full answer on some of those. One of them says something like the word "three", period, "and then they need to be introduced to the cow population"; I wish I'd let that one go on a little bit further.
Anyway, you can have some fun playing with large language models on Hugging Face. They're not going to be much use unless you do a really good job of prompt engineering, and perhaps train them on the kind of problem you need to solve. That's the kind of thing we're doing over on the qary project, an open source project to build a chatbot that you can trust, one that has a rule-based approach to managing the conversation rather than a purely generative one, so you can keep it grounded in reality. Anyway, I hope you've enjoyed this, my first ever Hacker Public Radio podcast; I've enjoyed it, and I hope you have too. Greg, do you have any questions or thoughts?

We spent a lot of time looking at all the different models, so it's worth exploring all the different sizes, tiny to big, and seeing which ones work for your use case.

Indeed, yeah, that's a really good point. We had trouble finding one that was small enough for us to do live on this pair programming that we're working on. But you can, and this was just one model out of many, many, many thousands that you can choose from, so have fun searching around on Hugging Face and find yourself a model.
You have been listening to Hacker Public Radio at hackerpublicradio.org. Today's show was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is. Hosting for HPR has been kindly provided by AnHonestHost.com, the Internet Archive, and rsync.net. Unless otherwise stated, today's show is released under a Creative Commons Attribution 4.0 International license.