Episode: 4112
Title: HPR4112: JSON and VENDORS and AUTH ohh my!
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr4112/hpr4112.mp3
Transcribed: 2025-10-25 19:44:06

---

This is Hacker Public Radio episode 4,112 for Tuesday the 7th of May 2024.
Today's show is entitled, JSON and VENDORS and AUTH ohh my!
It is the 110th show of operator, and is about 21 minutes long.
It carries an explicit flag.
The summary is: I talk and rant about JSON and vendors.

Hello everyone, and welcome to another episode of Hacker Public Radio with your host, operator.
So I'm making pre-dinner. Pro tip: if you have ADD, the only way to make a meal is
to pre-make it, because I can't seem to make a meal exactly when it's supposed to be
ready.
Anyways, that's not what we're talking about today.

This is a rant about JSON, in response to a recent episode I just listened to.
I've worked with it a lot, right?
It's pretty much the de facto standard for web applications and for processing any data.
It's so prevalent, and this is sort of the rant, that all of your web apps are essentially
just JSON apps sitting on APIs.
I would say more than half of the web pages you're on are probably some
JSON app, and any kind of service-based commercial tool is all JSON.

The problem with that is if you're not a programmer and the API that the company
gives you access to is, for lack of a better term, dog shit, purely because companies
don't have interoperability in mind. That's how capitalism works and where we
are in the current software space.
If I make a tool, a security tool specifically, or even an IT tool, and it has a feature, and
someone else has a different tool that has a different feature, they're going to try
to eat each other at some point in time.
That's just how software works, or at least security software.

So we bought all this security software and then we said, oh my
gosh, we have all these tools and they can't talk to each other.
So we're going to just call it SIEM, or I'm sorry, it's SOAR. SIEM is a different piece,
which I believe in, but I don't really believe in SOAR.

Anyways, this is sort of a rant on applications and web applications and vendors, and also
a call for help on JSON.

So what I do is I might parse maybe 100 megs of JSON at a time, or I'll be working
with something that takes JSON as an input, and it takes a certain type of input.
For example, the HTTP Event Collector.
When you send something to Splunk through the event collector, it expects it in a
certain format.
You can't just push anything at it. From what I can tell, sometimes you can, I don't know.
But in general, it has to have a specific format with essentially extra headers.

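As a rough illustration of that "specific format with extra headers," here is a minimal Python sketch. It assumes the commonly documented HTTP Event Collector conventions: the record is wrapped in an envelope under an `event` key, and the token travels in an `Authorization` header prefixed with the word `Splunk`. The function name, host name, and token are hypothetical, and the request is only built, never sent.

```python
import json
import urllib.request


def build_hec_request(url, token, record, sourcetype="_json"):
    """Wrap one record in an HEC-style envelope and return an unsent POST request."""
    # The collector expects the payload under "event", not the bare record.
    envelope = {"event": record, "sourcetype": sourcetype}
    body = json.dumps(envelope, separators=(",", ":")).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Splunk {token}",  # assumed token header format
            "Content-Type": "application/json",
        },
    )


# Hypothetical endpoint and token, for illustration only.
request = build_hec_request(
    "https://splunk.example.com:8088/services/collector/event",
    "hypothetical-token",
    {"host": "scanner-01", "open_ports": [22, 443]},
)
```

Building the request separately from sending it makes it easy to print `request.data` and compare it with what the server actually accepts.
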
What I have a problem with, and maybe somebody can help me, is analyzing JSON
in general: solving syntax errors, key-value pair errors, list or
dictionary errors.
Say there's a big blob of JSON that's improperly formatted. Maybe it's
got a newline and there's no comma in it, or maybe it's a dictionary inside
of a dictionary, but the dictionary's not closed out properly on each line.
It's hard when a human has to look at it, right?
This might be me answering my own question, in that
large language models can sometimes help me parse that stuff out.
And other tools, online tools like a JSON beautifier, can help you parse stuff out.

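For the "where is the error and on what line" question, one option that needs nothing beyond the Python standard library is `json.JSONDecodeError`, which carries the position of the failure. The sketch below assumes the data is one JSON object per line; the function name and sample records are made up for illustration.

```python
import json


def find_bad_lines(lines):
    """Return (line_number, column, message) for every line that is not valid JSON."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # blank lines are skipped, not reported
        try:
            json.loads(text)
        except json.JSONDecodeError as err:
            # colno and msg come straight from the standard library parser.
            problems.append((number, err.colno, err.msg))
    return problems


sample = [
    '{"host": "a", "port": 80}',
    '{"host": "b" "port": 443}',       # missing comma between pairs
    '{"host": "c", "tags": {"x": 1}',  # outer dictionary never closed
]

for number, column, message in find_bad_lines(sample):
    print(f"line {number}, column {column}: {message}")
```

For a file on disk, passing the open file object works the same way, because iterating a file yields its lines.
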
But what happens is, when I create JSON output, or maybe a piece of software
creates JSON output, it's standard that one time it's created.
But say I'm doing 10,000 requests, or 10,000 posts, and I'm concatenating them all together
and trying to encapsulate them properly. How do I do that
at scale, idiot-proof, where I'm not having to manually process or manually
look at each piece of JSON I'm receiving and sending? It's getting annoying
every time I want to work with JSON, essentially for the output of anything.
I make my own APIs for a lot of security tools, because most of them are
awful.
The reason they're awful, again, is that interoperability is just, by default, not
a thing.

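One hedged answer to the "concatenating 10,000 posts" question above is to never join JSON by hand. Serialize each record with `json.dumps`, which always produces a single line, and group the lines into size-limited bodies. The function name and the byte limit below are illustrative assumptions. Whether a given endpoint accepts one object per line is something to check against its documentation.

```python
import json


def batch_bodies(records, max_bytes=500_000):
    """Yield request bodies made of one JSON object per line, split near max_bytes."""
    batch, size = [], 0
    for record in records:
        # json.dumps escapes line breaks inside strings, so each record is one line.
        line = json.dumps(record, separators=(",", ":"))
        line_size = len(line.encode("utf-8")) + 1  # +1 for the joining newline
        if batch and size + line_size > max_bytes:
            yield "\n".join(batch)
            batch, size = [], 0
        batch.append(line)
        size += line_size
    if batch:
        yield "\n".join(batch)


events = [{"event": {"n": i}} for i in range(10)]
bodies = list(batch_bodies(events, max_bytes=40))
```

With a realistic limit this turns thousands of single posts into a handful of larger ones, and every line in every body can still be parsed on its own.
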
And then also people abuse APIs, because they're dumb.
They'll do something like a select star, or they'll do a massive
query that will break the back end.
And so that's where you get limitations on every API where, oh, we'll give you the
first 5,000 results, and after that you have to page, and there are probably other tools
to help me with that too.
A lot of them have it built in, so they'll send you the URL to the next page. For example, Microsoft
Defender will send you the URL with the cookie you have to pass to get the next page, if
you're using their API.

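The "they send you the URL to the next page" pattern can be followed with a small loop. This sketch assumes a Microsoft Graph style response, where results sit under `value` and the follow-up URL under `@odata.nextLink`. Other APIs use different key names, so both are parameters. The `fetch` callable is a placeholder for whatever authenticated HTTP call is actually in use.

```python
def iter_pages(fetch, first_url, next_key="@odata.nextLink", items_key="value",
               max_pages=1000):
    """Yield every item from a paged API by following the next-page URL it returns."""
    url = first_url
    for _ in range(max_pages):  # hard stop so a bad API cannot loop forever
        page = fetch(url)
        yield from page.get(items_key, [])
        url = page.get(next_key)
        if not url:
            return


# Fake two-page API used only to show the call shape.
fake_pages = {
    "page-1": {"value": [1, 2], "@odata.nextLink": "page-2"},
    "page-2": {"value": [3]},
}
all_items = list(iter_pages(fake_pages.__getitem__, "page-1"))
```

Passing `fetch` in as an argument keeps the paging logic separate from cookies, tokens, and whatever else the real request needs.
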
But the problem with that, again: Microsoft also has API calls that don't exist in their Defender
API, or Graph API, or anywhere else in their stack.
Because they want you using their platform. They want you in
their UI, and not doing the automation, and not using your own tooling.
So when there's an API call in the web UI that does exactly what I want, and then I go
to look for that API call in the official documentation, of course not only is it not there,
but all the useful API queries are beta.
So you run into this across the board. CrowdStrike: the APIs are completely useless, and it costs $30,000 just
to get your data in some form that you can push somewhere else to be filtered.

So if you're like us, you get 3 million events every two days of people opening PDF documents.
I don't need that.
I don't need 50 million heartbeats and 70 million PDF documents opening today.
Maybe I need a sample of that, maybe I need whatever.
My ability to filter that out is lacking because I have to create my own APIs, bypassing
or including MFA, or some kind of... my rice is still hard. I have brown
rice, but it's not quite brown rice. Anyways, so I make my own APIs, and oftentimes they're
hacky, but they get the job done. And then when they change
the UI, of course, everything stops working,
which, you know, that's not great.

So the real question is, even if we did cough up the 30 grand to get the data to another thing,
wouldn't it have the same problem in a different platform?

So, jeez, anyway. So, you know, things like single sign-on, right?
I want to create a Ping authentication plugin, or a central module
for Python, so that I can authenticate to any of our internal apps and create APIs that have
MFA or Ping auth attached, and create APIs for every single thing that I want to. So
we have hoteling, right?
That'd be the first use case it'd be nice to solve.

So hoteling is when you, you know, have to book a cube and, you know, sometimes
a seat. Well, I have a proof-of-concept API that I built that will let me book my
chair remotely, or automatically. I have an array that takes every single Tuesday
and Wednesday of every week, and puts that in with a cookie. What I like
to do to test my cookies and their lengths is create a cookie and try to create
it with a non-expiring date, if that's possible, with the web UI and something like
Burp Suite.
So I'll create that cookie, and maybe it is a non-expiring cookie: it never
expires, or I don't get an expiration date.
Well, what happens after 30 days, or two weeks, a couple of weeks, whatever?
That cookie tends to die at some point, anywhere from, I want to say, a couple
of weeks to three weeks. Even if every minute I just hammer away at it, because I don't
care, even at that speed, every 59 seconds, I still will lose my session.

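The "array of every Tuesday and Wednesday" part of the booking proof of concept is easy to sketch with the standard library. This is only an illustration of building the date list. The function name is hypothetical and nothing here talks to a booking system.

```python
from datetime import date, timedelta


def office_days(start, end, weekdays=(1, 2)):
    """Return every date from start to end (inclusive) whose weekday is in weekdays.

    Weekdays follow date.weekday(): Monday is 0, so (1, 2) means Tuesday and Wednesday.
    """
    days = []
    current = start
    while current <= end:
        if current.weekday() in weekdays:
            days.append(current)
        current += timedelta(days=1)
    return days


# Two weeks around the episode's release date, Tuesday 7 May 2024.
bookings = office_days(date(2024, 5, 6), date(2024, 5, 19))
print([day.isoformat() for day in bookings])
```

Each date can then be formatted however the booking request expects it.
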
So what I want, right, is Ping authentication, so I can authenticate to that service using
regular password credentials without MFA, and
automate the whole thing. Then the next step would be to do the same thing for CrowdStrike,
same thing for Splunk, same thing for any kind of web interface, like Microsoft.
So essentially whatever I could do with a browser, I could
do in my own API.

So if anybody knows any way to troubleshoot JSON, outgoing and incoming, parse it, and tell
you where the problems are and on what lines, let me know. I had a 56 meg Nmap conversion from XML
to JSON. There's a GitHub project that converts Nmap XML to JSON. It's called something
like extreme parser, extreme Nmap parser, something like that. It eats XML and spits
out more or less usable JSON, and then I can parse it further and send it to Splunk.

Problem: it's like a 50 meg JSON file, and I posted this and last month's didn't go through,
I don't know why. So I go back and I check, and I run the commands and export it out.
Looks fine: no parsing errors, nothing unhappy. Go to push, do the
post: no data, error code 8, fail, whatever. Oh great, my syntax is broken, something's
broken, the whole script is broken, I don't know what's going on.

So I pick one line of
the JSON and I push it through. Perfectly fine, no problem. I do a random shuffle,
shuf -n 20, to pick a random 20 lines, and push
that to Splunk using curl. And this is instantaneous, we're talking within milliseconds, not even
seconds. It's able to parse 50 megs, works fine. So then I went, okay, that's weird,
I'll try to push the whole file. Nothing. Try to push one line of the file,
perfectly fine. Ten lines or five lines, whatever I picked, perfectly fine. And then I go to
push the 50 megs again, and it works perfectly fine. Why? I don't know, don't ask me why.

Now I start to second-guess myself. Now do I need to create a loop that will just
loop through all 10,000 events, 10,000 lines, 100,000 lines, and post them
individually? Then I've got 10,000 posts to the server, and that's not efficient. Nobody
wants to do that. So now I have to do error checking to make sure that the post passed, and
have some kind of threshold, and then maybe error out to a log file somewhere, or
push it to an email address, which I'm not going to do. I'm just going to check the dates
on my imports manually when I go to do scripting and stuff. So all this to say, JSON is annoying.

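The error checking, threshold, and log file idea mentioned above could be sketched roughly as follows. It assumes the actual network call is wrapped in a `send` callable that returns an HTTP status code, and that 200 means success. Both are assumptions, as are the function name and the default threshold.

```python
import logging


def post_all(bodies, send, max_failures=3, log=logging.getLogger("import")):
    """Send each body in turn; log failures and stop once max_failures is reached."""
    sent, failed = 0, []
    for index, body in enumerate(bodies):
        try:
            status = send(body)
        except OSError as err:  # connection problems and timeouts
            status = f"exception: {err}"
        if status == 200:
            sent += 1
            continue
        failed.append(index)
        log.error("batch %d failed with %s", index, status)
        if len(failed) >= max_failures:
            log.error("stopping after %d failed batches", len(failed))
            break
    return sent, failed


# To write the errors to a file instead of the console:
# logging.basicConfig(filename="import_errors.log", level=logging.ERROR)
```

The returned list of failed indexes makes it possible to retry only those batches instead of rechecking import dates by hand.
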
It's nice once you get it in Python and you can manipulate it. You can push and pull
it, pull out key pairs, and search strings inside of key pairs. It's a beautiful
thing for doing some of that stuff. But when it comes to a 50 meg file that has syntax
errors, or whatever you're pushing to, most of the time you have to fire up Burp Suite and figure out
how those requests are coming in, and rewrite the whole thing. And building your own APIs
gets difficult because maybe there's a carriage return in there.

I've seen APIs get angry at newline characters, or newline slash return, slash n
slash r. I think of it as registered nurse: backslash r is the return,
backslash n is the newline, so RN, registered nurse. I always think
of that for newlines. Traditionally, Windows is \r\n and Unix
is \n. There are programs you can use, like dos2unix and unix2dos, that will take
those characters and pull them out. I think they do some other things too, like maybe
a null character in there, or some weird Unicode characters they'll pull out. So it does fix some
other stuff besides newlines and carriage returns.

But I've seen APIs where, you know, I'm looking
at Burp Suite and everything looks great, it's beautiful. Copy and paste, push the request:
invalid, don't understand. Repeat the original request: works great. Copy and paste my request
to replace what I replayed on the last post: error. Okay, what's going on? I take
it out of the pretty-print mode and I observe that it's one solid JSON line, right? When you
push that post to the server, it's a solid line. When you look at it in Burp Suite, it's
all pretty-printed for you. And if you're importing new items, or you want to change those key-value
pairs or something with the data, and you paste it in, it's got those carriage
returns in there.

So like any software development, and working with any servers or services,
things can get wonky when you start doing stuff on the back end, writing your own APIs
and whatever. But yeah, anything having to do with handling JSON, troubleshooting, I think
Beautiful Soup. I'll try to remember to put it in the show notes, but I have
some Python functions that I am using to create APIs. And
I'm trying to write one that will basically emulate SAML, because I don't want to do SAML
with, at least the way our server uses it, I don't know how to do SAML with Python. So I'm writing
my own Python SAML module, which is extremely exciting, because it's a Base64-encoded, weird encrypted
thing, and you get a key, and then use that key encrypted or something. It's quite the mess.
I'd rather not write my own, but I think that's what I'm going to have to do if I want
to jump over the interoperability hurdles that I have with my current setup. So, anyways.

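Going back to the pretty-printed request that fails when pasted: one way to sidestep stray carriage returns is to parse the pasted text and serialize it again as a single compact line before sending. This is a minimal sketch using only the standard `json` module. The function name and sample data are made up.

```python
import json


def to_single_line(pretty_text):
    """Re-serialize pretty-printed JSON as one compact line with no raw line breaks."""
    data = json.loads(pretty_text)  # \r, \n and indentation are just whitespace here
    return json.dumps(data, separators=(",", ":"))


# Text as it might look when copied from a pretty-printed view on Windows.
pretty = '{\r\n  "name": "scan-01",\r\n  "ports": [22, 443]\r\n}'
compact = to_single_line(pretty)
print(compact)
```

A line break inside a string value survives the round trip as the two-character escape `\n`, so the body stays on one line either way.
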
I'm not sure I helped anyone, but I do need help. So if you know any good JSON parsers
or JSON fixers, or ways to handle and/or observe cookies, let me know. I know Burp Suite has a module that
will basically record your traffic, and then it will tell you which of the cookies
that you acquired during the authentication process are valid or useful for the actual authentication.

So when you log into any website, or you use an app on your phone, whatever, you're sending
tons and tons of data other than authentication. You're probably sending
three to five times more traffic to the site than you need to actually log in.
So there is a Burp Suite plugin, I can't remember the name, and I haven't been able to get it to
work, but it's a plugin that's supposed to help you with that authentication piece. And that's
how I would more easily create my own, essentially, SAML deal. I think
the SAML modules in Python actually don't do what I'm trying to do. They actually require
a certificate to be placed on the Ping server, but what I want to use is plain-text
credentials, because I know I can use plain-text credentials with Burp Suite to log
in to internal stuff without any of that stuff. So anyways, I hope that helps out.

If I do end up getting, you know, more stuff, more Python helper scripts, I'll add those
to the show notes, or I'll do another show probably. Right now it's a lot of Beautiful
Soup stuff. One is called find cookie: it'll scrape out the cookies. There's another one that's
pretty simple, pp simple, that basically will take any input and output that value.

And the idea is that it will detect what it is. If it's JSON, boom, it'll parse the JSON. If it's a
string, it'll parse the string. If it's got weird characters in there, like colons and
semicolons, or maybe it's separated by a weird almost-JSON thing, it'll kind of make it look
viewable. So if there's a newline in there, it'll put the newline in.

So the idea is I want a single pretty-print function that will take any input, regardless
of whether it's a Boolean or a JSON blob or whatever, and it will spit it out, and maybe
even detect the length or the size of it and say, oh, this is big, so I need to cut the
data up or summarize it, or just show the end and the beginning, things like that. Eventually
I want to tune that up so I can easily observe and debug stuff in Python.

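A single print-anything helper along those lines might look like the sketch below. It is not the host's function. The name `pp_any`, the 400-character default, and the exact formatting choices are all assumptions made for illustration.

```python
import json
import pprint


def pp_any(value, max_chars=400):
    """Render any value as readable text, keeping only the head and tail of long output."""
    if isinstance(value, (bytes, bytearray)):
        value = value.decode("utf-8", errors="replace")
    if isinstance(value, str):
        try:
            # Valid JSON text: re-indent it.
            text = json.dumps(json.loads(value), indent=2, sort_keys=True)
        except json.JSONDecodeError:
            # Plain string: turn escaped \r\n and \n sequences into real line breaks.
            text = value.replace("\\r\\n", "\n").replace("\\n", "\n")
    elif isinstance(value, (dict, list)):
        try:
            text = json.dumps(value, indent=2, default=str)
        except (TypeError, ValueError):
            text = pprint.pformat(value)
    else:
        text = pprint.pformat(value)  # booleans, numbers, None, custom objects
    if len(text) > max_chars:
        half = max(1, max_chars // 2)
        omitted = len(text) - 2 * half
        text = f"{text[:half]}\n[{omitted} characters omitted]\n{text[-half:]}"
    return text


print(pp_any('{"b": 1, "a": [1, 2]}'))
print(pp_any(True))
```

Returning the text instead of printing it directly keeps the helper usable in log messages as well as in an interactive session.
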
I use a linter, which I've been forced to do because the tools I use require the code
to be scanned by a fancy, whizzy thing. If I don't use it, then I have to go through
it manually and clean it up anyways. So anyways, I'll try to post what I've got so far for
people to use as far as Python, and pretty much teach myself Python and try
to do actual real coding and sanitization. If you have an easy way to do that, let me know. I don't
see a way to do it. I see it's all manual, which means people are writing their own security
modules, and that's a really horrible idea. But as far as I can tell, other than
basic inputs like specific UIDs, email, phone number, you know, basic
stuff like that, I don't see any way to do input validation, which is kind of terrifying. So anyways,
I appreciate it, and let me know if you have any tips on writing secure Python, or modules
that I can use that will help me idiot-proof my way into writing secure Python.

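For the basic inputs mentioned above (UIDs, email, phone number), one common approach is allow-list validation: accept a value only if the whole string matches a strict pattern, and reject everything else. The patterns below are deliberately simple assumptions, not a complete definition of a valid email address or phone number, and the names are hypothetical.

```python
import re

# Illustrative allow-list patterns; tighten or loosen them for real data.
PATTERNS = {
    "uid": re.compile(r"[A-Za-z0-9_-]{1,32}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "phone": re.compile(r"\+?[0-9]{7,15}"),
}


def validate(kind, value):
    """Return value unchanged if it fully matches the pattern for kind, else raise ValueError."""
    pattern = PATTERNS[kind]
    # fullmatch requires the entire string to match, not just a prefix.
    if not isinstance(value, str) or not pattern.fullmatch(value):
        raise ValueError(f"invalid {kind}: {value!r}")
    return value


print(validate("uid", "operator_110"))
```

Validation like this narrows what gets through. It does not replace parameterized queries or proper escaping further down the line.
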
Cool, take it easy.

You have been listening to Hacker Public Radio at HackerPublicRadio.org.
Today's show was contributed by an HPR listener like yourself. If you ever thought of recording
a podcast, then click on our contribute link to find out how easy it really is.
Hosting for HPR has been kindly provided by AnHonestHost.com, the Internet Archive and
rsync.net. Unless otherwise stated, today's show is released under a Creative Commons
Attribution 4.0 International License.