Episode: 567 Title: HPR0567: Miscellaneous Radio Theater 4096 2, Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0567/hpr0567.mp3 Transcribed: 2025-10-07 23:15:02 --- [Music]

Hello, my name is sigflup, and welcome to this episode of Miscellaneous Radio Theater 4096. In this episode we're going to be taking a tour of the Supercomputing Center here in Minneapolis at the U of M, Minneapolis being pretty much the home of supercomputers, Cray coming from here. So here we are: Walter Library, 117 Pleasant, "Supercomputing Institute," it says. I'm here with a couple of friends: Crypto of DC612 and Total Blackout, representing Minneapolis Binrev, and hopefully more people; I've posted about this here and there. We'll see. And we're at the doors. And we're inside.

[Crosstalk at the entrance.] So, how do you like to be addressed? I'm Brian Roerstelman. [Introductions.] Hey, I'm sigflup. Hey. Chris? Alright. Come on over here.

Alright, so we're going to show you the data center for the University of Minnesota's Supercomputing Institute. So, we exist on campus to support all the research needs of anybody that's... well, let me back up. We provide this service to any faculty member at an institution of higher education in the state of Minnesota, and/or their collaborators. So: grad students, postdocs, another faculty member anywhere in the world. We're just shy of 10,000 accounts total in the LDAP right now. The majority of those accounts, I'd say 85%, are right here on the Twin Cities campus, but we do have accounts from all over the country and all over the world. We are free. So these researchers that want access to big data storage, databases, web servers, big compute clusters: as long as they're a faculty member in higher education, we're free. They just ask. It gets reviewed by other faculty members; there's a committee, a peer-review-the-science kind of thing. It's mostly a rubber-stamp process: you want to do big science? Okay, great, that's perfect for MSI.

We like to say we focus on the high end. That's very nebulous. High performance computing, what does that mean? You know, this device right here has got the same computational capacity as a Cray supercomputer from the 1970s. Is it high performance computing anymore? No, because there's something else. But generally, it's that nebulous, vague cloud at the top end of the spectrum. So that's where we try and focus.

So the meat of what's in here is the big supercomputers, but we also do a lot of other stuff. We house all the research data. You'll see a machine in here: it's 1,000 nodes, 8,000 cores, all one big machine. We've got some astronomy folks who've used the entirety of that machine. One simulation generates 20 terabytes of data, and that's just the results of the run; now they want to do something with it. They have no idea what it looks like, no idea what it is. They want to make pictures, make movies, otherwise analyze the data. So you've got to get everything out; you've got to store the data someplace. So we provide all that infrastructure as well. And our core networking in this room is all 10 gig. We're directly tapped into the U's 10 gig core.
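[Aside, not part of the recording: the two numbers the guide just gave, a 20-terabyte result set and a 10 gig network, give a feel for why the networking matters. A minimal back-of-the-envelope sketch in C, assuming an ideal, uncontended 10 Gb/s link and ignoring protocol overhead; the arithmetic is illustrative, not a measurement of MSI's network.]

    /* Rough transfer-time estimate: 20 TB of simulation output over a 10 Gb/s link.
       Assumes an ideal, uncontended link with zero protocol overhead. */
    #include <stdio.h>

    int main(void) {
        double bytes   = 20e12;            /* 20 TB of results, decimal units */
        double bits    = bytes * 8.0;
        double link    = 10e9;             /* 10 gigabits per second          */
        double seconds = bits / link;      /* comes out to 16,000 seconds     */
        printf("20 TB over 10 Gb/s: %.0f s (about %.1f hours)\n",
               seconds, seconds / 3600.0); /* roughly 4.4 hours, best case    */
        return 0;
    }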
We can also bypass the university's entire network and go straight into the regional optical network, called BOREAS-Net, at 10 gig speeds. BOREAS-Net connects us down in Chicago to the National LambdaRail, which you could think of as a kind of Internet3. We can also then tap into international high-speed networks like StarLight, or we can go to the west coast, hit Seattle, and bounce off to the Pacific Rim, all at native 10 gig speeds. So we can have 10 gig pipes from a machine here to a machine in Korea and push all that data around, which is really important, because our people want to do that; they have collaborators all over the world. So that's why we're here.

Now, half the staff are mine. That's all the systems administrators, systems programmers, database people, web developers. The systems administrators are actually split into traditional infrastructure folk and then the big supercomputing systems. I actually do have a desktop support group. We have some weirdos who use Windows, a lot of other smart people who use Macs, and the rest of us, the really smart ones, all use Linux. For all the staff, of course, we manage all our desktops. We're a little different than most of our peers at the national level because we've got these laboratories. There's two in this building: one on the first floor that's for visualization, and one up on the fifth floor which is a more traditional bunch of computers in a room that you can just sit at and use all the software. We use it for teaching, and it's used for experimental things. We're working with Hardcore Computer down in Rochester on their liquid-immersion systems. They run them up at 4.5 gigahertz; they do everything: they overclock the memory, they overclock the GPUs. We're getting several of those for evaluation, playing... and I mean evaluation. We've also got labs out on Washington Avenue in the Academic Health Center area, and we've got a lab over at the St. Paul campus. There are actually two on the south side of Washington. So it's physical space where a user can go sit down and directly interact with all their data at MSI, because we manage those machines: they directly mount all our file systems, they can print to our printers, they can use all our software, which is a very beneficial experience, right? I mean, you can sit at a Linux box, you can remote connect, you can pump X sessions back and forth, but it's always better to just be sitting at the real thing. So we've got that.

The other half of the organization is support. That's a bunch of PhD folks in science-specific areas, primarily in the life sciences. We've got geneticists, bioinformatics kind of folk, proteomics, that kind of stuff. But we've also got traditional folks: a couple of math PhDs focusing on statistical analysis and data mining. We actually don't have any computer science PhDs. Yeah, really. We've got two folks with master's degrees in physics, but they're primarily our generic HPC people. That's all in the support group. On my staff, I think one of my managers has a master's; the rest of them are all undergrads. Some of them come from computer science, some don't. I mean, frankly, some of the best computer folks have poli-sci degrees, actually. And frankly, I do a lot of interviewing, I look at a lot of resumes, and we'll put out a job ad for a systems administrator...
And I get recent graduates from computer science at the University of Minnesota, and all they've ever done is program in Visual Basic, and maybe some Access databases. They've never touched a command line. They don't know how to spell vi. They've never seen Emacs. They haven't got a clue. I mean, so it's tough.

Do you go to the computer science people and say, hey, these folks can't use any of this? Well, yeah. It's a big debate at the national level, frankly, about the curriculum. Right? I mean, what we're doing in here has always trickled down, right? What makes a supercomputer super is just that it's bigger and faster. But it's only bigger and faster for a little while, and then it's a gaming machine, and then it's your desktop, and then it's in a laptop, and then it's in the phone, right? It just takes a matter of time to get from here to there.

So I'm going to go do one more sweep for that last guy. Sure. I'll be right back. Hold the door, because if you push the crash bar, it will sound the alarm. Oh. Hold up. We'll wait here. We'll wait here for you. Okay.

One big difference, though, is the multicore aspect. Well, and that's, yeah, that's the issue, right? So historically, we've seen the gains from Intel, thank you very much, right? They just crank the clock: double the gigahertz and you've doubled your performance. Simple, right? But Moore's law. You know Moore's law, right? The real Moore's law is actually the number of transistors per area per dollar, which roughly translates into performance. We're still on Moore's curve, well, Moore's flat line: still increasing performance if you read it that way, as well as by the original statement of transistors per area per dollar. But yeah, we're no longer getting the clock increases, and that's a direct result of leakage current. So leakage current means more power, right? The smaller we make these transistors, the more they actually leak, and the more power you've got to have, not only because you've got more transistors, but because they're not as efficient; they're leaking power. I'm sure you guys have built machines. You know the thermal envelope of a regular Xeon processor is 120, 130 watts. It's a light bulb. You'd burn your hand touching those things. So they've capped it, right? Intel has said we're not going to go any higher than 130 watts; it's just getting ridiculous. So now they go multi-core. And yes, here comes the problem. We've seen this coming for years. Four or five years ago we started talking about it: we've got to change the academic curriculum at the university, because people need to start going back to pthreads and shmem and multi-threaded programming. We've got to do it, right? It's hard. Well, yeah, it's hard. Of course it's hard. Assembly was hard. But you've got to know how to do it. Somebody's got to know how to do it, because that's where you're going to get the performance, right? So you're going to have to go to a hybrid model where you're still going to have distributed machines and you're communicating with MPI, but then you're going to have these big, fat nodes, and within that node you've got to start threading.
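[Aside, not part of the recording: the "pthreads and shmem" the guide mentions is ordinary shared-memory threading. A minimal sketch in C using POSIX threads; the array-summing workload and thread count are made up for illustration, not code from MSI. Build with something like cc -pthread.]

    /* Minimal POSIX threads sketch: split a sum over an array across NTHREADS threads. */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define N 1000000

    static double data[N];

    struct chunk { int start, end; double partial; };

    static void *sum_chunk(void *arg) {
        struct chunk *c = arg;
        c->partial = 0.0;
        for (int i = c->start; i < c->end; i++)
            c->partial += data[i];
        return NULL;
    }

    int main(void) {
        pthread_t tid[NTHREADS];
        struct chunk chunks[NTHREADS];

        for (int i = 0; i < N; i++)
            data[i] = 1.0;                          /* something to sum            */

        for (int t = 0; t < NTHREADS; t++) {        /* hand each thread a slice    */
            chunks[t].start = t * (N / NTHREADS);
            chunks[t].end   = (t == NTHREADS - 1) ? N : (t + 1) * (N / NTHREADS);
            pthread_create(&tid[t], NULL, sum_chunk, &chunks[t]);
        }

        double total = 0.0;
        for (int t = 0; t < NTHREADS; t++) {        /* join and combine partials   */
            pthread_join(tid[t], NULL);
            total += chunks[t].partial;
        }
        printf("sum = %f\n", total);
        return 0;
    }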
I forgot who it was, I think it was, what's her name, Fran, the person who wrote COBOL. She mentioned that there are three general rules when it comes to speed, well, performance: one being the architecture, another being the compiler, and the third being the algorithm. It seems like we've done what we can with the first two; now the algorithm is the hardest part. Well, the algorithm is tough because, yes, you know, this is a whole special area of computational science that focuses on the algorithm, right? And it's always been about parallelizing the algorithm; whether that's for a distributed computing model or a threaded computing model is somewhat irrelevant, right? But somebody's got to figure out a way to tackle taking the serial thing and making it parallel. That fraction that you cannot parallelize kills you. Even if you can parallelize 95% of the code, the best you'll ever get is a 20x speedup. You can't get any faster because of that 5% that has to be serial. So yeah, we've got to have threaded applications. We've got these multi-core pieces that are just going to get bigger and, from one perspective, worse. And again, you've got a graduate student applying for a job, and they know how to write a single-threaded Visual Basic kind of application that can talk to an Access database through ODBC. Great; it's not going to help.

Not to make that any easier, but real quick: does anyone have a problem with anything, including your pictures being taken? Oh, that's fine. No, okay. No. Okay, just making sure. Of what? Of us? Well, yeah. Yeah, that's fine. All right, I'm in DEF CON mode and I'm used to making sure that people don't mind their pictures being taken. That's fine. Well, I'll give you the finger when you do. That's what you have to do. No, not the recorder. That also explains why I ask when we do the same thing. All right, that's coming from a slightly different reason, but yeah.

So since about 2002 (was that in the shot? No), I can't remember exactly, it could have been fall of 2002 or maybe spring of 2003, Intel started holding what they call an HPC roundtable twice a year, once in the spring, once in the fall. It's invite-only; I go; it's folks like me at the other national centers. And they give us about a seven-year view of what they're doing with the processors. Seven years out gets kind of fuzzy. Five years out, they've been pretty darn accurate so far. Three years out, they're already making them; they might still tweak a few things. But they tell us what's coming. And then they say: this section of the die is left-over transistors, we've got space, what do you guys want us to do with it? So some of the SSE stuff that you've seen, look at SSE4, 4.1, the new AES encryption instructions and stuff, anything vectorized, going back to the 1970s when Cray was doing it, they're throwing it in there because it's being used in these machines. So we've had some influence on the architecture. QPI, you know, their answer to HyperTransport, moving toward an optical interconnect. So latency is more of the issue than bandwidth. [Inaudible.] I'll show you what ours is in here: they're 80 gigabits a second, 40 bi-directional. So having that in silicon, having that within a board, is going to be kind of cool too. But it's more about latency; they're effectively sharing memory. So here's the classic question: what is this? It's a nanosecond. Bingo. That's 30 centimeters. That's a nanosecond, right?
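[Aside, not part of the recording: the 95%-parallel example the guide gave a little earlier is Amdahl's law. With a serial fraction s, the speedup on n cores is at most 1/(s + (1 - s)/n), which tops out at 1/s, i.e. 20x for s = 0.05, no matter how many cores you add. A tiny sketch in C; the core counts are arbitrary, chosen only to show the ceiling.]

    /* Amdahl's law: best-case speedup when a fraction p of the work parallelizes.
       The 95% figure is the one quoted on the tour. */
    #include <stdio.h>

    static double amdahl(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);   /* serial part + parallel part over n */
    }

    int main(void) {
        double p = 0.95;                    /* 95% of the code parallelizes       */
        int cores[] = { 8, 64, 1024, 1000000 };
        for (int i = 0; i < 4; i++)
            printf("%7d cores -> %4.1fx speedup\n", cores[i], amdahl(p, cores[i]));
        /* as n grows, the limit is 1/(1 - p) = 20x, which is the guide's point */
        return 0;
    }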
So you start talking about bigger dies, because you're putting more cores on them. And I'm sure it's no surprise: they've already got roadmaps for their 12-nanometer fabrication techniques. They've just released their 45s and 32s; the 22s and 12s are coming, and they're looking at eight. So we're up to what now, three billion transistors, something like that, on a typical Nehalem, the six-core Westmere processor. The dies are going to get bigger. So what we're calling a processor these days, with all the cores inside it, is physically going to get larger. You're still going to have multiple sockets, and you've still got to have memory, and you're already beyond the nanosecond. And that's a big thing. That's a really big thing. They're going to be able to push the clocks a little bit higher incrementally as we go along, so instead of hovering around the 2.2s and 2.8s and stuff like that, we'll get back into the low 3s, even with these many-core nodes. But now you've got a bus out there somewhere too, and it gets really important to have as low-latency a transport as possible. The bandwidth isn't the problem; it's the latency. Let's go on in.

All right. So you guys have been in data centers before? Yeah. One thing you'll notice here is we run a little hot. So the building: the state put about $63 million, in 1999, 2000, somewhere around there, into restoring this building, the library, and renovating this space. MSI used to be across the river. There's actually still a sign: if you're on Washington Avenue, just after you go across 35, on the west side of 35, there's a sign on the northwest side there, "Minnesota Supercomputing Center." So we're 25 years old. Back in the mid-80s, when some of the faculty got this started, we were one of the first, well, we were the first university to buy a Cray machine. And they realized that it's really hard to operate these things and make them run and everything else. So the university spun off a company called the Minnesota Supercomputing Center, abbreviated MSC. And the university kept the Minnesota Supercomputing Institute, which leased computer time from MSC. So we would deal with the researchers; we'd handle the allocations of things and manage things with MSC.

Well, when this building became available, we'd already made the decision to start running our own machines, and so we had some stuff to move over here. There was a bunch of IBM SPs; that was the big machine. Deep Blue, the one that beat Kasparov, was an SP. So we had about 20 frames that we were going to bring over here, and that was it. We had a few other machines that were literally sitting on the floor; at that time, racks were only kind of coming into vogue, early to mid-nineties, when all this stuff was acquired. And we moved in here in 2000. It's about 3,600 square feet of space, and I thought, you know, 375,000 watts of power? I'll never use that. 80 tons of air cooling? That'll keep us going forever. "640K of RAM was enough." Yeah, 640, thank you, Bill. "640 kilowatts is enough for me."

So to put the latest machine in, I had to add another 500 kilowatts. Now the total power in these rooms is one and a quarter megawatts. I still only have the 80 tons of air cooling. It's in the subbasement below us; there are some chillers that they put in, and the cooling towers are, of course, outside. I get chilled water that runs through here; that's all that these things can handle. We also have economizer cooling in the winter, so we'll suck in air from the outside and run it through filters.
It gets pushed into the air handlers and then handled the same way. But with fans sucking in, and fans in the air handlers, and fans blowing up under the plenum, and then fans in the returns, there have been times when there's practically a vacuum seal and you can't get the door open, or you leave and the door sits here floating because the room is over-pressurized.

Did you have fluid dynamics people help you out with that? Well, we've actually analyzed it in depth; we have done that. There's a bunch of structural issues in here: we can't put nice plenums and ductwork returns on the ceiling because of the ribbing from the floors above us, and fire hazards, and sprinklers, and all these other things. So we run a little bit warmer than you'd normally like. Of course, Intel is pushing the idea that you can run data centers at 90 degrees, 80 degrees Fahrenheit, and they stand behind that. You know what I'm saying? No problem with that, as long as you actually get flow to keep the heat moving off of the fins, you know, the cooling fins that are on top of the processor. But what about all the other components? Is Intel saying, yeah, those will still work? There's a lot more in a system, of course, than the processor.

So we've gone to water cooling. Direct water... well, semi-direct water cooling. We have water-cooled doors on the back of these things. So we're still doing the classic suck-the-cold-in-the-front, blow-the-hot-out-the-back, but my hot aisle is now cold again, because I've got chillers right there at the back of the racks cooling that hot air. And in one case, you'll see, it's totally self-contained; it's the APC solution. It's enclosed on the top and the sides of the racks, front and back; it doesn't actually pull air from the room, it just circulates around inside. Does that work pretty well? Yeah, it works just fine.

We're entering two giant doors, one of which says HIGH VOLTAGE all over the place; there's a picture of [inaudible]. Well, about halfway down you can see the dark gray with the silver front. [Inaudible.] But the rest of this is our core infrastructure, for the most part. Yeah, like I mentioned, we do have a little of the goofy Windows stuff going on. Mostly it's the standard core infrastructure: file, print, database, a little bit of storage. Well, we name our systems after Sesame Street characters. This is Elmo. These are just some big, fat nodes, no real high-speed interconnect. It's very good for MATLAB; believe it or not, people actually consider that a programming language rather than an application. Yeah, some statistical stuff and a lot of genome sequencing happens here. They're 128-gigabyte-RAM machines, so single-threaded with a bunch of memory is what comes to mind.

We do tinker around with the dark side. [Inaudible.] I thought you were going to say that you crack password hashes or something like that by tinkering around with the dark side. There are 80 cores in there, and frankly it actually kind of works. Yeah, Microsoft HPC, that's actually something they did.
They've done a very good job of stealing smart people, so they've got a real good set of folks working on this, and they're turning out a pretty decent product. It still doesn't have that much uptake, though. So this is a big stack. Yeah, yeah. I'm sure you guys have seen blades. Never seen one in person. Performance, it says here, 6.22 teraflops. That's theoretical. Simple concept: take the 1U box with everything in it, take all the common crap out, and put it in the blade chassis. Now the blade is just CPU and memory, basically. Frankly, IBM has been doing this concept for about 12 years. I mean, you could buy one of their high-end models a long time ago, called the IBM S80, and you'd buy this huge half-rack of stuff, and it was nothing but processors and memory. That's it, nothing else. Some cable ports in the back for what they called RIO, remote I/O. And then you bought another big chunk, and that was all the PCI buses; that's where you put all your adapters and everything else. So IBM has been splitting CPU and memory from the rest of the system for a long time, and this is just a natural evolution of that. So the blade, your node, the thing that actually runs the OS, has no Ethernet, has no InfiniBand, has nothing else. It gets all that from the chassis, right? It's got a special passive connector to the chassis, and that's where all the networking takes place. So we wire up to the back of the chassis, right here, the Ethernet in the back and the InfiniBand, and there you go. So these are old, bought four years ago: dual-socket, dual-core AMDs. But the machine runs fine. It's got 1,100 cores total.

The history behind all this is literally rocket science. In the very early 90s, a gentleman by the name of Tom Sterling and his friend Donald Becker worked for NASA. They were asked to be on a team that was going to verify a new rocket engine. They said, sure, we'll do that. And so they submitted their budget request, and the budget included a couple million dollars for the latest, greatest supercomputer of the day so they could do the analysis. And the answer was: hey, great budget plan, but we're not giving you money for the computer. But you've still got to do the analysis. So they literally went out, scavenged 16 old 486 machines, and created the first commodity cluster supercomputer, which they named Beowulf. I actually helped hire Tom Sterling away from NASA to Louisiana State University about five years ago, and I asked him, how did you come up with that name? He said he was literally at his mom's house, sitting in the rocking chair, trying to think of what he should call this machine, and saw her old copy of Beowulf on the shelf, and it just stuck. And now Beowulf is a generic class of system, right, besides being a proper noun. That concept has simply evolved: a bunch of distributed machines, each running its own copy of an operating system, participating in a single application, communicating over what's hopefully a really good network for passing data, passing messages and stuff. If you look at the Top500, which is a website, top500.org, that tracks the 500 fastest machines in the world: in 2000, I think there were two Linux clusters on that list. Today, I want to say 95% are Linux clusters.

What's the most supported distributed computing operating system? I've been looking at some of them, and I see support stopped years ago for quite a few of the distributions. Well, the most... you mean which Linux? Yeah.
So the answer is Linux. The question of which specific distribution? I wouldn't say there is one. People build them with everything, right? You can do it with Debian. You can do it with Gentoo. I mean, these are all running SUSE. Some people run RHEL. Some people just do CentOS. There's Scientific Linux, which is like CentOS with special patches. I mean, it runs the gamut. People build all kinds of clusters. It doesn't matter. It depends.

So that guy down there is just NFS, plain and simple. This guy here is running Lustre for a high-speed file system. Calhoun, on the other side, is running PanFS, the Panasas file system. We experimented on this guy for a while with Gluster (Gluster, starts with a G), PVFS, PVFS2. There's CXFS, the clustered XFS file system from Silicon Graphics; GPFS from IBM, the General Parallel File System. So any distributed parallel file system is typically what you're going to find on these clusters. But sometimes just plain-Jane NFS works just fine, you know, especially if you've got nice, fat gigabit pipes.

So I was mentioning the 80 gigabit network. Right here. It's InfiniBand, QDR InfiniBand. So again, each channel is 40 gigabits, but it's dual channel, 80 gigabits. One microsecond latency, node to node, zero-byte payload. Real payloads you'll see two to three microseconds; a really big payload, a gigabyte, four, five, six microseconds. But again, this is the magic. This is the secret sauce, right? This is the message passing interface, the MPI library. So back in the day, on the first machine, called Beowulf, they used what's called PVM, the Parallel Virtual Machine. And it basically took the 16 nodes and made them look like one. But it wasn't really; it hid the fact that it was 16, obviously. So from the user's perspective, you could do an ls and see the processes across all 16 machines; PVM was going out and doing everything you needed to do. I said ls, I mean ps, right? So. But there were no standards. It was a research project at Carnegie Mellon, and they grabbed it and made it work on their machines. Well, other people took it and made it work elsewhere, and there were no standards. Everybody realized this was crazy. So the good folks out at the national labs came up with a standard. They formed committees of academics; it took them years, but they came up with the MPI standard. And then Argonne also wrote the definitive reference implementation; it's called MPICH. So with MPICH installed on your system, and an application written against an MPI library, right? You can do C code, Fortran code; there's Perl bindings, there's Python bindings, there's PHP bindings; there's all kinds of bindings to MPI libs from all these different languages. You can launch an application with a resource manager across all these nodes, fire them up, and they'll start to coordinate, communicate, and work in parallel, however you designed it, whatever the algorithm is.

So one thing I wanted to bring up on what you said: the second of those three, the compiler, is still a huge issue. A huge issue, right? Because you've got all these dumb computer science people out there, and if there were some magic auto-parallelization, we'd all benefit, wouldn't we? So the compiler is huge. And there are, you know, many PhD theses still being written around automatic and auto-parallelizing compilers, or solving branch problems in that area.
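[Aside, not part of the recording: to make the MPI description concrete, here's a minimal message-passing sketch in C using only standard MPI calls; nothing in it is specific to MSI's clusters. Each rank contributes a value and rank 0 collects a sum with MPI_Reduce.]

    /* Minimal MPI sketch: every rank contributes a value, rank 0 prints the total.
       Standard MPI calls only; build with mpicc, launch with mpirun/mpiexec. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which process am I?          */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* how many processes in total? */

        int contribution = rank, total = 0;
        MPI_Reduce(&contribution, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("%d ranks checked in, sum of ranks = %d\n", size, total);

        MPI_Finalize();
        return 0;
    }

[Launched under a resource manager with something like "mpirun -np 800 ./a.out" (an illustrative command line, not MSI's actual setup), the same binary runs as one process per core, and the MPI library carries the rank-to-rank messages over whatever interconnect sits underneath, such as the QDR InfiniBand described above.]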
But this is our latest. This machine debuted at, like (God, I should know), mid-50s on the Top500 list. It's now somewhere in the 70s or 80s. So, a fairly fast machine. The Calhoun machine over in the corner actually debuted at number 47 in the fall of 2007, and it's not on the list today. It just fell off. It's too slow; it won't even make it. So things happen fast. They change quickly.

Do you think Pascal might be a good language to auto-parallelize with the compiler? You know, I haven't done enough with Pascal. I don't know. How about Haskell, or is it Haskell? Haskell. I've personally not touched Haskell. I know some people who swear by it, and there's some more people, so maybe it would. You don't actually write the procedure; the compiler generates the procedure. You just define the result. So for auto-parallelizing with the compiler, I think that would be a good language. It could. Yeah, very well said. So you haven't heard of it used much on these systems? No, not on these systems. Again, it could be some research project somewhere else. For myself, I haven't used Haskell either, but I have heard it mentioned in relation to parallel processing, so apparently there's a lot of research there. Yeah. That's interesting.

Back to the cooling, right? So here's the refrigeration for this set of nodes. You see that? Plexiglass. Plexiglass and weather stripping to keep in the air.

This is where the battery on my recorder died, so unfortunately you can't listen to the remainder of the tour. It was a lot of fun. There were a couple of other computers there, including the one we got cut off on; I believe its paper performance was 94 teraflops. And what we did after that is we went upstairs, where they had this visualization room, which was right awesome. Really cool. They had three projectors projecting onto three surfaces that surround you, flickering back and forth with LCD shutter glasses, and the glasses you wear have a tracker on them. So it's pretty much virtual reality, right? It's a 3D picture shown on these three screens from your perspective as you move, which was hella fun. Our tour guide talked about how it's primarily used at the moment for visualization of hearts and all these medical things, veins and whatnot. What we did get to play with was this 3D paint program, which was really cool. I'm completely forgetting the name of the library; I don't know if it was the library they developed for it, the 3D library, or the name of the program that we used. It was right awesome: you held this little wand and you painted in 3D space, and so we had a great deal of fun. I believe Zach, a member there representing 2600, the 2600 meetings here, asked... what did he ask? I think it was something like "Unreal Tournament, yes or no?", or something like that. And, well, they don't run that game, but they do run another game. It's kind of difficult, though, because most games aren't meant for three screens, so they have to have a couple of players that just follow the other player, and those perspectives are rotated to match how the screens are oriented. So I thought it was kind of funny: they admit to playing 3D shooters on their big screens.

But that's it. Thanks for listening, and hopefully we can do this again. Bye-bye.

Thank you for listening to Hacker Public Radio. HPR is sponsored by caro.net, so head on over to C-A-R-O dot N-E-T for all of your hosting needs. Thank you very much. Thank you.