Big Cheese Podcast Show Notes - Episode 1, Season 2
Topics Covered:
- The Atlantic's content partnership with OpenAI
- How publisher data gets weighted inside an LLM (RAG layer vs. training data)
- OpenAI's GPT-4o release and low-latency voice chat
- Apple's WWDC, iOS 18, and AI expectations for Siri
- AI hardware flops: the Humane pin and the Rabbit R1
- Plausibility vs. accuracy in LLM output, disinformation, and critical thinking
Relevant Quotes:
"We're going to be talking a lot about data science, LLMs, and demystifying some of the stuff and maybe providing a few nuggets you might not have already heard."
On The Atlantic and OpenAI partnership: "It's like Google's product exposure, right? If you have a shop, you can open up an API, and then you can actually buy the product from Google search pages."
On Apple's WWDC: "Everybody's waiting to figure out what Apple's going to do with AI."
Regarding data in LLMs: "I haven't seen the model itself. But if I had to make a guess, it would be something along the lines of citation networks: where does this come from, and is it reputable or not?"
On the importance of critical thinking: "People are going to be trained by these models, unintentionally, to be critical thinkers about what's going on."
On the three body problem and AI: "What is to say that introducing a slight disturbance in the force can't bring it all down?"
Stay tuned for more exciting guests and discussions as season two unfolds! If you want to learn more about Zach Cardwell and his data science expertise, visit data317.com. See you next week for another episode of the Big Cheese Podcast.
Welcome back to the Big Cheese Podcast. I am Sean Heist. Today we have a special guest, Zach Cardwell, from Data317. And with us as always, Brandon Corbin and Jacob Wise. So we're really excited. Thanks everyone who watched season one. This is now season two. And we're gonna have some really awesome guests this season, and we're just really excited to be back in here. I don't know, how long has it been? It's been a month. It's been a month. It was only supposed to be two weeks, because I was gone for two weeks, and then we came back, you were gone for a week. Then we came back, and then Dylan was flying off to France or something like that. Yeah, Dylan is, you don't see him on camera, but he's our producer, he's the man. So we thank you for letting us come back for another season. Yeah, so we did season one for 26 straight weeks. Never missed a single week, not even during the holidays. And even though we didn't have 26 episodes. No, no, it was more. It was 28 or 29. Oh yeah, it was 28 or 29, yeah. So today we're gonna be talking a lot about data science and LLMs. Zach's got a lot of expertise, he's been working in this arena for a long time, and we're kind of demystifying some of the stuff and maybe providing a few nuggets you might not have already heard. Yeah, I'll try. But first and foremost, we've got to talk about some news topics. And we're gonna kick it off with number one, which is a very interesting development, something we've been talking a lot about when it comes to journalism and LLMs, specifically, obviously, ChatGPT and OpenAI. So Jacob, tell us about the Atlantic partnership with OpenAI and what that was all about. Yeah, so the Atlantic has partnered with OpenAI, and they're basically gonna give them access to their data, which is interesting because, as you all probably know, the New York Times is actively suing OpenAI for using their data without asking permission. And the Atlantic's hoping that they can, oh, first of all, from my research, I found out that they just became profitable. So journalism is an extremely hard business to make money in, I guess. That makes sense. But they're hoping to do sort of a product partnership with OpenAI to give them access to their data and then have it within the product. When you search for things, the Atlantic's content would be surfaced up to the top. It would be given attribution. And then hopefully that translates to people going to the website and subscribing to the Atlantic for the premium content. It reminds me of, what was the, oh my gosh, I'm totally gonna blank, what was the LLM service that was kind of trying to do something like this before? I'll remember it later. Not Victora. No, but the founder came on, and their big thing was attribution, and he was like, "Oh, maybe there's a model where we could surface subscribing to the provider within the app." And I'm like, "Well, yeah, that's a great idea." It's like Google's product exposure, right? Like if you have a shop, you can open up an API, and then you can actually buy the product from Google search pages. So I imagine it'd have to be something like that, where the long-term play is growing subscriptions through surfacing their data, and they're the ones that are getting the attribution. So, anyway, I don't know what you guys think about all that. Is this fool's gold, or is this a thing of the future?
So one of the things that we talked about before was that if this is kind of a subscription model, I could go into ChatGPT and be like, "Hey, I wanna subscribe to the Atlantic, I wanna subscribe to the New York Times, to these things, maybe I'm paying a buck a month," whatever it is. The question then becomes, is that data actually part of the training data, or is it now just another RAG, retrieval-augmented generation, layer on top? And that's the piece that I'm not clear about, and I don't know if they talked about it at all in the piece that you read, if they planned on that. It was pretty vague, but yeah. I guess every single provider could do that themselves, right? You used to have the thing where you could add Google search to your website; now you can just do site: plus the domain within Google. But I mean, I guess the goal would be something like that, or even if it's not subscribing through ChatGPT, it's getting that funnel, getting those people to the source eventually. Well, I also wonder, are we gonna start seeing more and more people basically just release an API that's their RAG? Like, here's our endpoint that you can then pull in through an action of a GPT, or you can do it where you just pass it a prompt, you get the results back, and you can consume it however you want. [There's a sketch of what that layer looks like at the end of this exchange.] Yeah, so this is a question of ignorance more than anything, but has that actually ever worked? Like, if we think about the business standpoint, remove the LLM, remove the AI, I know this is an AI podcast, but still, when you had these services that would surface news articles and that sort of thing, did anyone actually ever make any money off of that? Well, two come to mind, and I think it's a great question. One is Medium, when they basically went paid, you know? And so they're still around, and the content quality on there is still pretty good. I couldn't tell you, because you get the first two sentences before the overlay, oh my God. But the other one that comes to mind is Apple News. And I bet, when that first came out. So that's the one that I'm aware of, and I was listening to either Pivot or something else before. And the big thing with Apple News is the Wall Street Journal, right? You get Apple One, or whatever the subscription is called, and you're going to be able to see the Wall Street Journal. And my recollection, and it could be wrong, my recollection is that it didn't translate to new subscribers or anything. No, and I think that it comes down to layers of abstraction of cost, right? So if I pay you a dollar, you know, you gotta pay taxes on that dollar, you gotta pay the Apple tax, now you gotta pay the ChatGPT tax. And it's just the cost of distribution. The sheer cost of distribution to get this large audience is just hard, right? People used to get a bill in the mail for their Indy Star or for their New York Times once a year, and they'd pay it, and they'd get it delivered to their door. What's your cost there? It's very little. And now the cost of distribution is so high. And I wonder, for the Atlantic, if you're just making a play to do something because the cool kid in the room has asked you for a meeting, and you've got Sam Altman knocking on the door. You know what I mean? Is that a good business decision, or is it a cool business decision?
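A minimal sketch of the "publisher RAG layer" idea discussed above: retrieve the most relevant articles from an archive, then stuff them into the prompt with attribution. The archive, scoring, and source names here are illustrative assumptions, not anything the Atlantic or OpenAI has announced.

```python
# Toy RAG layer: retrieve publisher content, build an attributed prompt.
# Real systems use embeddings for retrieval; word overlap stands in here.

def score(query: str, doc: str) -> int:
    """Toy relevance score: count of words shared between query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, archive: list[dict], k: int = 2) -> list[dict]:
    """Return the k most relevant articles from the publisher's archive."""
    return sorted(archive, key=lambda d: score(query, d["text"]), reverse=True)[:k]

def build_prompt(query: str, archive: list[dict]) -> str:
    """Put retrieved articles into the prompt so the model answers from them,
    with attribution, rather than from whatever is in its training data."""
    hits = retrieve(query, archive)
    context = "\n".join(f'[{d["source"]}] {d["text"]}' for d in hits)
    return f"Answer using only these sources, and cite them:\n{context}\n\nQuestion: {query}"

# Hypothetical archive entries, for illustration only
archive = [
    {"source": "The Atlantic", "text": "A feature on how newsrooms license archives to AI companies."},
    {"source": "Example Blog", "text": "A recipe for sourdough bread."},
]
print(build_prompt("How are newsrooms licensing archives?", archive))
```

The key point of the design: the publisher's content never has to enter the training data at all; it rides along at query time, which is why attribution (and a subscription funnel) stays possible.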
So the subscription, so I think there's two things here. There's a subscription model that we're kind of talking about, which is a completely new kind of business model, where I'm interacting with a large language model and I've got subscriptions or something. But in terms of just being like, hey Sam, I've got a bunch of exclusive data that nobody else does, you wanna pay me $5 million a year and I'll give you access? Which is kind of what Reddit's doing. Exactly. Yeah, Reddit just was like, yep, okay, cool. And then everybody's up in arms about that. That's what's gonna happen: everybody who's got these platforms is just gonna see the writing on the wall and be like, all right, I can get 5 million, whatever it is. I don't even know what the numbers are. And OpenAI is just thirsty for data, right? There's nothing else left, right? Except for, you know, we talked about our Zoom meetings and all that, we'll get into that. But they need more data, and people are relying on their tool for probably more than it was designed to handle, which is really real-time information, to become the next Google. And you know, I never underestimate that type of company that has delusions of grandeur, right? They're gonna wanna do anything and everything to grow their user base. Is it at the sacrifice of even the core product? Yeah, no, I kinda wonder, you know, does the Atlantic have staying power? Like, do you go back 20 years or something like that, and is there value to that article itself, you know, to keep back from OpenAI? Or is it really better to just get the cash for something that you're not gonna make money off of anyway? I don't know. - Right. But it's very interesting. Yeah, well, I think we can kind of skip into one of our subtopics, honestly, from this, and come back to the news. But one of the things that we wanna talk about is the difference between data in an LLM. So, as a data science expert, is an Atlantic article weighted differently than some random thing that they scoured from the internet and don't even know where it came from? Or how does that work in a neural network? We have all these abstracted data points. Are they literally gonna surface the Atlantic higher and give it more weight? So, I mean, I haven't seen the model itself. But if I had to make a guess, you know, what they are going to do is something along the lines of citation networks: where does this come from, and is it reputable or not? We see this with Google; we were talking a little bit earlier about PageRank and that sort of thing. And you can see that kind of metric coming in and playing some of the same sorts of games with LLMs, in terms of the answers that are going to be surfaced.
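For concreteness, here is what a citation-network reputation score looks like in miniature: PageRank-style power iteration over who cites whom. This is a sketch of the kind of signal Zach is guessing at, not anything OpenAI has confirmed; the tiny citation graph is made up.

```python
# Toy PageRank over a citation network: widely-cited sources accumulate weight.

def pagerank(links: dict[str, list[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    """links maps each source to the sources it cites."""
    nodes = list(links)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, outs in links.items():
            if not outs:  # dangling node: spread its rank evenly
                for n in nodes:
                    new[n] += damping * rank[src] / len(nodes)
            else:
                for dst in outs:
                    new[dst] += damping * rank[src] / len(outs)
        rank = new
    return rank

# Hypothetical graph: the blog cites the majors; nobody cites it back
citations = {
    "atlantic": ["nyt"],
    "nyt": ["atlantic"],
    "random-blog": ["atlantic", "nyt"],
}
print(pagerank(citations))  # the widely-cited outlets end up weighted highest
```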
But then that just opens up a whole other can of worms. I mean, are you overweighting the neural network? Are you doing something else to it in order to do that? And does that undercut the fundamental value? Right, the fundamental value is not necessarily the human weighting of it, right? It's the natural weight of the dating. Yeah, of the data, of the dating. Well, I love the natural weight in my dating. It's usually pretty high. Yeah, and then does it just eventually become this gross pay-to-play platform? Like, you know, I don't know if I want to throw Angie's List under the bus, but we talked about that a few episodes ago, where it used to be that the best contractors were the ones at the top, because people genuinely thought they were the best, but now you pay to play, and it's unreliable. So I worked in the SEO space, so to speak, for a bit. And the most interesting thing about that space is that everyone has a hypothesis about how this stuff works. Like, oh, if we do this, we'll get more views, or we'll have more click-throughs, or blah, blah, blah. And then it's come out recently in the news that Google wasn't exactly forthcoming about what actually did contribute to the algorithm and all that. And we're opening that up all over again with this. It will be even more abstract. And the mystery is Sam Altman's power, is it not? And as long as they're growing their user base, this is becoming what we theorized in season one. Remember I said their models are what's gonna keep them popular, but then the models are gonna become ubiquitous? But really, it's become actually the opposite: the user base is creating the value and the power. And obviously they're coming out with more models and stuff, but as they grow their user base, they grow their network effect. Can we talk about 4o for a second? Sure. Okay, so if you didn't hear, we had a big announcement. They had their 4o release. Everybody was expecting it to be five; it wasn't five. It came out as 4o, not 4.0. Four, lowercase o, as in omelet, right? Or omni is really what I think it's about. So whoever names these things at OpenAI needs to be fired, in my humble opinion. They probably hired the Microsoft Xbox people. It's gotta be somebody from Microsoft, 'cause Microsoft has historically been one of the worst brand-naming companies. I love my Xbox Series X. (laughing) So anyway, the model comes out. The one I'm most excited about is the latency-free voice chat that you get with the app. And I mean, you can interrupt it, which is another thing that I've always struggled with, like, no, stop. So you can now do this interruption. It's rolling out for people who are on the paid plan, and apparently will be available for everybody. So explain the latency-free piece. So basically, if you have ever used voice chat with an AI model, you say, hey, tell me a joke. And then it's like, bloop, bloop, bloop, bloop. Hey, why don't scientists trust atoms? Because they make up everything, right? And it's that same stupid-ass joke it tells every time. But basically, it now is instantaneous. And it will be like, hey, tell me a joke, and, oh, okay, well, and what you can see is they actually add some filler words. It's just a better skeleton-loading state. Right. But apparently, it is streaming your input as you speak, so it is trying not to just be like, boom, chunk, wait, chunk. So that's really cool. It also opens up the 4 model for all the free users too. So if you were a free user, you only had access to 3.5. Does that give you DALL-E and all that then too? That actually is a good question.
I don't know if- 'Cause with 4, that was kind of the upgrade. Oh, you got 4, you got DALL-E. Yeah, that's a good question. I don't know if you actually get it with the free version. I would assume so, like they have to, because meta.ai has it, and theirs is actually really good. I was playing around with it. It does better with text than DALL-E. What, have you seen the translation one? Is that the- Yeah, yeah, yeah. That was the one too. They were doing the voice demo for Italian. They said, okay, every time you hear English, I want you to translate it into Italian, and when you hear Italian, I want you to translate it into English. And she's like, oh, okay. That's a whole other conversation. So the point I'm feeling, though, is this is maybe the tipping point for this stuff really becoming used in a voice interface. Which probably segues us to the Apple topic. Oh yeah. Or Nvidia stock going up 20%, 'cause I'm sure this shit is, like, super. Yeah. Like, how much? Yeah, yeah. We talked about it before. The concept of the neural network, the concept of the way that humans act, and the way that I'm just blabbering out stuff, and no one even knows what's coming out, it's just here, it's just going, right? Like that whole thing, right? That is how humans work. And traditionally, a chat-based interface doesn't work that way. It's clunky. If you go, hey Siri, what time is it? You know, and then, well, she doesn't interrupt you, she just says the wrong thing, and I just want to yell at her the whole time, right? But the latency-free thing just gets you totally natural. And so, transitioning to the next news item, and this is something that I wanted to talk about: WWDC is next week, which is Apple's developer conference. And it's not necessarily a conference where they talk about all their new physical products or all their physical releases. They talk about their core software and their operating system stuff; it's their developer conference. And their stock has been building up to this conference, because everybody's waiting to figure out what Apple's going to do with AI. And they really haven't made many official announcements about what they're going to do. But one of the things that we know is that they've been talking to Google, they've been talking to OpenAI. We now know that Apple is going to revamp some of their core apps, like Notes and Voice Memos, releasing core audio transcription in a much more advanced way inside of these applications. We also know that iOS 18 is probably going to come on iPhones built with some sort of M4-ish type chip. And we already know the M4 is going to the iPad, so when you upgrade to iOS 18, it's going to have this stuff built into it. And I guess the first question that came to mind for me is, it's typical Apple. They're doing this stuff, and they're building it into these apps that some people just swipe through and never even use, but literally, once you dig into them, they're some of the most powerful apps you'd ever use in your life. But is Apple going to underwhelm next week, or is it going to all fit into place and make sense? >> I'm legitimately curious about how this is going to play, because you think about how many folks have iPhones, how many folks have Macs, iPads, whatnot.
And suddenly, if this goes out the way that it seems to be rumored to, everyone's going to have access to these models. It won't be hype anymore. They'll be talking directly to these models, interacting with them, trying to do some new stuff. And that's going to be different. Everyone has been worried about their jobs. Everyone has been worried about what's going to happen when AI does X, Y, or Z, or tries to kill us. And I think the interaction with Siri is going to inform the general population more than any hype train ever has. >> Yeah, that's a good point. Anytime I talk to someone like my parents who don't use these things, first of all, it's an accessibility issue. OpenAI was the best player in the market for accessibility. But soon, if you have an iPhone or some Apple device, you're going to have access to this new tool, this new word processor, this new whatever, and it'll open up a lot of doors. >> I don't think that, so, two things about their release. One is that them talking to OpenAI now, when they're already going to have this conference and they already have iOS baked, shows you that they don't have an answer to who Siri's talking to. >> Right, right, that's, yeah. >> Do you know what I mean? They don't have that model; the stuff that they're building internally isn't ready. >> It's clearly not good enough to be able to compete with what's generally available. >> Otherwise, why would they want to go talk to OpenAI? >> Right, right. >> So I think Siri plus OpenAI-ish, or Gemini, whoever you want, is the play there, because it needs to have more than what you can probably bake into a device. >> Yeah, one of the big things, so I think the perception of how useful it's going to be compared to what you can get from ChatGPT and whatnot is going to be even more leaps and bounds, because there were some news articles a while back, on a couple other podcasts that I listened to, and they were talking about how they work with Siri right now by manually editing the answers that you can get from it, or manually editing the questions that you can ask it. And so when we're talking to Siri, we kind of do the Siri dance, when we're talking to these assistants. It's like, I need to say this in this order, and that sort of thing, right? And to have that be free-form and then interact with something like a functions API or something like that is going to be incredible. >> But then there's a second piece of this that we need to talk about: can OpenAI scale to a billion, two billion Apple users? >> I mean, that's probably a very good question, and why they're also talking to Gemini, because if there's one company that has the horsepower, it's Google. Google has more horsepower than anybody. >> I wouldn't sign that contract with them. And the other thing that came out in this pre-announcement is that they have this concept of a secure enclave. Okay, so Apple, what's their number one MO with your data? >> Historically. >> Privacy. >> Yeah. >> And so I think one of the reasons why people might be underwhelmed next week is because they're trying to figure out how to do all this stuff securely and privately. >> Right, and they are no slouches when it comes to machine learning. They have a really interesting blog where they put out some of the research that they're doing, and they were pioneers in differential privacy, where the idea is that you can muddle the data in a certain way such that you can still do stuff with it, but it effectively anonymizes you. And there's some things here and there that kind of go with that. And I wonder if there's going to be a nice mixing of that.
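For a sense of what "muddle the data but still do stuff with it" means, here is a minimal sketch of differential privacy via the Laplace mechanism. This is the textbook version, not Apple's actual deployment (their shipped systems use more involved local-DP variants); the user reports are simulated.

```python
# Laplace mechanism: add calibrated noise so aggregate statistics stay useful
# while any single person's contribution is hidden.
import random

def laplace_noise(scale: float) -> float:
    # The difference of two exponential samples is Laplace-distributed.
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_count(values: list[bool], epsilon: float = 0.5) -> float:
    """Noisy count of True values. A counting query has sensitivity 1,
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy."""
    return sum(values) + laplace_noise(1 / epsilon)

# 1,000 simulated users reporting whether they used some feature
reports = [random.random() < 0.3 for _ in range(1000)]
print("true count:", sum(reports))
print("private count:", round(private_count(reports), 1))  # close, but deniable
```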
>> Right, I think that they're probably differentiating their product in two ways: the stuff that you create or have created, and the stuff that they need to get from this real-time, internet-based LLM situation, with zero-latency conversations. Because they know that, and I think they're ready for it. You never know, sometimes it feels like they're way ahead and they've thought of everything. And I mean, Steve Jobs is long gone though, right? But there's this concept in the iOS 18 inputs they're doing where you don't have to really press anything on the screen anymore. I haven't seen exact examples, but they're coming out with basically a no-touch interface. And that's gotta have really, really good voice activation, for example. So it's interesting, but my prediction is that initially, the market might come out super cold on this conference, potentially, because Apple's actually trying to do this the right way. And maybe six months from now, we hear about how all our data from OpenAI got leaked, you know what I mean? They got hacked, and it's all associated with me, and it's everything. But Apple's gonna be the one that's actually trying to do it the right way. >> Well, I think the bigger risk is that we've had a hype bubble start, with investment in everything generative AI. Everyone knows this is coming to Siri. Everyone is expecting it. Apple is going to do wonderful marketing around it. It's gonna be incredible to see, you know? And then when people actually interact with it, and they get the example you were mentioning earlier, glue on pizza, you know? People are going to be very underwhelmed. And what happens then? This hype bubble has set up AI as Terminator or whatnot. And what happens when it comes in slightly under? Does that investment still work, or what happens? >> Yeah. >> Are we looking at generative AI with rose-colored glasses? >> Well, I think this is a good pivot into what my news topic is. >> Yes. >> Right, which is all about the hype cycle, and is it actually good? And so my news thing that I was gonna cover was starting with the Rabbit R1, and then I ended up on the Humane. So we're just gonna talk about the AI hardware from the last couple of weeks. >> So Rabbit, so the Rabbit device, talk to it. >> Right. >> Humane? >> So Humane is the little Star Trek communicator that you would just tap on your thing, and you could ask questions. Or if you didn't want to chat with it, you could put your hand out, and it's got little lasers, and it draws a little laser display, like it's 1989, on my hand. And then the Rabbit was the orange one, about the size of a square deck of cards, that had the screen, and it had the large action model. >> Which in reality, when you unboxed it, was just a Samsung phone. >> Well, it's even kind of worse than that. Okay, so let's start with the Humane one. So first of all, every reviewer just slammed both of these products. Two, I mean, Marques, MKBHD, MNOP, whatever, I can never remember his acronym.
But anyway, he butchered both of them. >> Oh, yeah, he's a good dude. >> Yeah, he's awesome. His stuff is always-- >> You know, he's a professional ultimate frisbee player? >> That wouldn't surprise me. >> Yeah. >> So, the pin, which is like 700 bucks, it overheats, the battery life is bad, the connectivity's problematic, it doesn't do voice recognition very well, it's super slow. It does nothing that your phone can't do, and your phone does it better. So that's fine, whatever. And so apparently the Humane company is now shopping for buyers already, after a month of release, and they're hoping to get somewhere between 700 million and a billion dollars for their technology. Moving on to the Rabbit R1, and the Rabbit R1 was the one I was super excited about; we talked about it a lot. >> Oh, we got shredded on YouTube. >> Oh, yeah. >> We were talking about it as a smartphone replacement. >> Yeah, I know, now that we're here. And so, it had this concept of a new foundational model called a large action model. And what it was supposed to be is that it could watch a screen, you could interact with the screen, and then it would be able to understand, "Oh, this is what you're doing with the screen. I can now play it back." That all made all the sense in the world to me. I'm like, "That's fucking awesome. This is gonna be great." You were supposed to, in the future, be able to record and train your own stuff. Come to find out: it's Playwright. >> Oh my God. >> It's Playwright scripts. It's hard-coded Playwright scripts. >> I knew there was some bullshit with that. >> So Playwright is basically an abstraction layer for you to program clicks or any input on the web. >> Yeah. >> So you can be like, "Hey, I want you to go to Google, I want you to hit this input." I can actually record that script, it generates the code, and I can deploy that code. [There's a sketch of what one of those scripts looks like below.] >> Yeah. So apparently, that is how all of these work, and that's why none of them are working. You can't order from Uber Eats, you can't do any of this stuff, and it's all, "Sorry, that service is down right now." Now, we don't necessarily know; maybe they have something that's actually writing Playwright scripts. Maybe it's a large language model that can do something like that. But everything we're seeing is not that.
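To make the claim concrete, here is what a hard-coded Playwright automation looks like, a generic sketch of the pattern being described, not Rabbit's actual code. The target site and selectors are hypothetical, and that brittleness is exactly the point: when the page changes, the script breaks.

```python
# A hard-coded Playwright script (pip install playwright; playwright install).
from playwright.sync_api import sync_playwright

def scripted_search(query: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://www.google.com")
        # Hard-coded selector: breaks the moment the page layout changes,
        # which is the "sorry, that service is down" failure mode.
        page.fill("textarea[name=q]", query)
        page.keyboard.press("Enter")
        page.wait_for_selector("#search")
        title = page.title()
        browser.close()
        return title

print(scripted_search("pad thai near me"))
```

Nothing in a script like this "understands" the screen; it replays a fixed sequence of clicks and keystrokes, which is very different from a model that learned the interaction.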
And then people also hacked into it. Basically, it's just running Android, and it's an Android app that people have now been able to run on their own devices. So that whole thing is just kind of deflating. And apparently the CEO, before this, was into a big crypto thing that went completely sideways, and that's where he just kind of ran for the hills. So the whole thing is suspect. He had to be involved in the Fyre Festival then, man. Probably. But, again, at $199, I'd much rather make the mistake of buying the $199 Rabbit R1 and find out it's not that good right now, versus the $700 Humane pin that's gonna end up in that junk drawer of yours. Gosh, I can't tell you how many war stories that brings up from past jobs and that sort of thing. So, I was at a company once that shall not be named, but they wanted to be able to ask a search bar, like, "Hey, what is happening in my data?" And they'd get a result back and all this, and this was 2017 or something like that. And I remember talking to the CPO there, and I'm like, "You know, that's kind of hard. That's gonna take some research, and it's gonna be like six months or something like that. There's a lot of math in there." And it's like, "Oh, no, fine, we'll just string match against the questions." And no lie, they ended up hiring someone for like $250K a year, and all they did was sit there and make up questions. Oh, no. And then each question would be tied to a specific query that they would run against the data, and then, yeah, they would make up questions, and they would string match, something like that. [See the sketch below.] Some of the specifics might be a little blurry, so please don't get angry at me. (laughing) But, yeah, I can almost see what happened there. Exactly, it's the same thing. Yeah, they hit some bottleneck. It was like, "Oh gosh, this is really hard. We need to get this going. We're just gonna crowbar it." It kind of reminds me of, I don't know if we talked about this last season, but the Amazon stores, where you're supposed to just go in and grab and then leave, it's cashier-less or whatever, but they couldn't figure it out using computer vision and all that, so they just had cheap labor watching video. Yeah, that's why you would get charged like 24 hours later, 'cause it took time to review that video. Fake it 'til you make it, I guess.
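Here is roughly what that "string match against the questions" hack looks like. The question bank and the queries are made up for illustration; the anecdote's real specifics are, as Zach says, blurry.

```python
# Canned question-to-query routing: fuzzy string matching, no understanding.
import difflib
from typing import Optional

QUESTION_BANK = {
    "what were sales last month": "SELECT SUM(total) FROM orders WHERE ...",
    "who are my top customers": "SELECT customer, SUM(total) FROM orders GROUP BY ...",
}

def route(user_question: str) -> Optional[str]:
    """Fuzzy-match the user's question against hand-written ones and return
    the canned query. Anything nobody anticipated simply fails."""
    hit = difflib.get_close_matches(user_question.lower(), list(QUESTION_BANK), n=1, cutoff=0.6)
    return QUESTION_BANK[hit[0]] if hit else None

print(route("What were sales last month?"))  # matches despite punctuation
print(route("forecast churn for Q3"))        # None: nobody wrote that question
```

The failure mode is the same as the hard-coded device scripts: it works only inside the catalog someone hand-built, which is why it needs a full-time person making up questions.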
Well, so I really think that these two companies screwed up just massively by not making this a developer release. Just do a small developer release: here's the price point, there's only gonna be, I don't know, a thousand of 'em. I mean, the Rabbit R1 sold some huge-ass number. I mean, they sold a ton. - What? Oh yeah, it was crazy. I mean, we were talking about it. Yeah, they went viral. Both of them went viral. - We should have gotten a referral. But man, all of them were just completely not ready for prime time. Do it like what Apple did with the Vision Pro, right? Like, again, they found out there are not a whole lot of people buying. Think about that type of founder though, right? So the guy's already been through a hype cycle. He wasn't into crypto 'cause he valued it. He was in it to make money. - That's true. And he probably saw the AI thing earlier than other people did. - He did. And he said, "Here's another hype thing." And you know what? They came out with a product right at the right time, fall 2023, the peak of uncertainty, when everything's magical. Right. - Right? And they sold $20 million. Okay, so maybe they didn't sell $20 million. Let's say they sold $10 million, whatever it is. But then they also raised like $20 million. So the VCs are also breathing down their neck. And that's another problem with all these. Yeah, I mean, that's my whole problem. Well, go ahead. Oh, no, sorry. - I'm just gonna let you talk about VCs. (laughing) No comment. (laughing) No, no, the big question that I have around it is: why do you need this new device? Yeah. You know, we have these watches, which are great. We have this phone. We have everything there. Why do we need this new device? And I don't think anyone has convincingly answered that question. Well, both of them are like, we need to be staying away from our phones. We're on our phones too much. So then we just need a new device that we're gonna be spending all our time on. So this is the premise that I think, as a team, we got to towards the end of season one, which is: who do you think you're gonna beat here? Apple's gonna come out with iOS 18, and all you voice people and ChatGPT-overlay people are gonna be dead, right? Microsoft already invested in that and has that part. Google has a huge, huge infrastructure for dealing with all this stuff and a great model. Do you really think that you're going to compete on product design or infrastructure or software with these big players? It's getting harder and harder. I mean, who's gonna compete with Nvidia, right? Nvidia right now is the thousand-pound gorilla, and to compete with them, you need to have an infrastructure set up that's basically impossible. And so, this is a question for you: where is the opportunity? Oh, that's a small question right there. (all laughing) No, wow, tell us. Let me know, let me know. I think the opportunity is in trying to harness these things that are being put out, and trying to help on the services side, which is obviously where I am, and so are you. It's implementing this stuff, not necessarily inventing it. So there's a few different scales of where the opportunities really sit with these LLMs and that sort of thing. There's the services implementation side: what can you do with the data, what not? And then there is, you know, is this truly the correct way to go about doing some of this AI stuff? It's an open question right now whether these models are actually too large: too large, too complicated, requiring all this extra, specialized hardware in order to run, and no one is a hundred percent sure that's the case. These just happen to be the best models that we've found so far. So when you think about opportunities there, mining for a different approach or something like that could be very fruitful, or, I mean, honestly, it could also be a waste of money and time, but-- Do you think we get to a point where, instead of having these massive large language models, we instead have a handful of smaller language models that are kind of controlled by an orchestrator, something that says, oh, here's what they're asking, which model should I use? [There's a sketch of that pattern below.] So I could be misremembering, but there is some aspect of that with GPT-4 right now. Oh, yeah, that's right. They talked a little bit about that. Well, that's what Gemini does, for sure. They have a bunch of underlying models. So depending on the nature of what the input is-- That's the human in the middle, you know: what's code? Send it to this guy. Some poor person just sitting there all night, okay, there, there, there. It's a switchboard. Yeah, I mean, it is one of those things, though: when you think about the opportunities that sit out there with development of AI in the macro sense, our brains are very capable machines, and yet you think about the power consumption to do what we're doing and how we think about things, versus the power consumption of these models. There's something there that could likely be refined.
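A minimal sketch of the orchestrator-over-small-models idea from the conversation: a cheap routing step dispatches each request to a specialist. The model names and the keyword classifier are stand-ins for illustration, not any vendor's confirmed architecture.

```python
# Router pattern: classify the request, send it to a specialist model.

SPECIALISTS = {
    "code": lambda q: f"[code model] answering: {q}",
    "math": lambda q: f"[math model] answering: {q}",
    "chat": lambda q: f"[small chat model] answering: {q}",
}

def classify(query: str) -> str:
    """Stand-in for a tiny routing model; keyword rules keep it self-contained."""
    lowered = query.lower()
    if any(w in lowered for w in ("bug", "function", "compile")):
        return "code"
    if any(w in lowered for w in ("solve", "integral", "equation")):
        return "math"
    return "chat"

def orchestrate(query: str) -> str:
    return SPECIALISTS[classify(query)](query)

print(orchestrate("Why won't this function compile?"))
print(orchestrate("Solve this equation for x."))
```

In production the classifier would itself be a small model, and the win is exactly the one discussed: most traffic never touches the biggest, most power-hungry model.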
In the immediate term, when we think about opportunities, what you can do with this stuff from a services standpoint, a lot of it is trying to figure out: what is the actual application of an LLM? What is the actual application of AI, or whatnot? I don't know if that answers your question too much. No, and I think that kind of segues into some of the more data-science-specific questions. Some people didn't even think about AI for 20 years until, what, a year ago or less, right? And that was really fundamentally associated with LLMs. Oh yeah, so the history around this stuff is really, really interesting. This is not the first hype cycle with AI. It happened in the '80s and early '90s, and that's when there was an AI winter, because it didn't deliver on the promises. Now, there's no reason to think that's the case right now, but we don't know where we're at. Well, wasn't it a similar situation, like they didn't have the compute back then? Similar to the dot-com bubble, where for the Chewys, or the pre-, the Pets.coms, there wasn't the distribution for it yet, right? Yeah, so who was doing it back then? Oh gosh, that's a good question. I can point you at the research, but yeah. But the neural network computing concept was invented in, like, the '60s, wasn't it? Yeah, yeah. And it still holds true, like that architecture exists. Yeah, to an extent; there have been modifications here and there, like how the math works and all that. A big thing that was happening in the '90s and early 2000s, if my memory holds, is folks were finding out that they didn't need neural networks to do things. It's like, oh, if we happen to transform the data in a certain way so that we can draw a line between it, we effectively solve the problem, but without all these unconstrained parameters that sit in any neural network you have. And it could just be the math nerd in me, but I do wonder if there's an opportunity to do a lot of this large language stuff with far fewer resources, using a simpler concept that gets to the heart of everything. It seems like that is a very logical path that we would go down. Because again, like you were saying, we do all this computation in our brain for one kilowatt or whatever it is, right? So maybe we're just missing some certain thing, and somebody comes out with some breakthrough that all of a sudden makes it so we can scale things up even faster. Well, the architecture's proven, right? We exist, and we build, like, skyscraper-type stuff. Yeah. Like pyramids, or maybe that was aliens, I don't know, they probably got really good neural networks. But no, coming back to that kind of thing: that's not the only thing that you do, right? That was this new thing popularized by ChatGPT, but for a data scientist, is it a distracting thing that's happening now, where everyone's coming and saying "I need AI" and they're really talking about LLMs, when they should just be using math or traditional, you know, machine learning? What is that landscape like, right? So, yeah. So this has been something that's kind of interesting. First off, when people come in and say "I want to use an LLM" or something like that, usually what they're aiming for is solving some sort of problem, right?
It's like: I have this idea, I need to solve this problem, this is what I've heard through my network about how you're supposed to solve it, and that's what they think. And that's fine. I mean, these folks have companies to run; they're not sitting there watching Hacker News and all that sort of stuff. They need to get to a solution. So that actually is pretty good, because people now think, hey, this problem can be solved by a computer, which is a new thing. It's like, oh, it's not Bill in receiving anymore who has the magic experience to do what needs to be done. It's like, no, no, no, I think AI can actually do this. So it opens up a lot of conversations. Where it fails is where this hype cycle hits, because people start getting unrealistic expectations. And also, most importantly, they don't know how to measure what is good versus not. Right, and we have this concept that you talk about, which is plausibility versus accuracy. Yeah, yeah. Explain the difference; I think it's important for people to understand that. Yeah, so if we go back to basics about how these models were actually trained: you basically get some text, you vectorize it up, you do a whole bunch of stuff, and the goal is that you're going to predict the next token, right? And that's super duper cool. You can produce reasonable-sounding text and do a bunch of things with it in order to get this result out. But at no point in there is there any kind of correctness built in. That's what's going on, at least in the classical transformer view; I don't want to get a bunch of comments about "well, actually" and whatnot. (laughing) And we will, I guarantee you. Yeah, I get that there are other things going on inside of it. But if you think back to basics, correctness isn't built in by default. It's something that you have to add to the model. And that means what you're actually doing is trying to find something that's plausible. That's what the model is actually optimizing for. And this is why it's so good at convincing us that it's solved the Riemann hypothesis, which is this big prime-number thing that's out there, or that it's going to cure world hunger or something like that. Because it is very, very good at convincing us that it knows what it's talking about.
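The plausibility-versus-accuracy point can be shown in a few lines. Here a trivial bigram counter stands in for a transformer: it is "trained" only to continue text the way the corpus does, and nothing in the objective ever checks whether a continuation is true. The corpus is obviously made up.

```python
# Toy next-token model: training rewards plausible continuations, never true ones.
import random
from collections import Counter, defaultdict

corpus = ("the sky is blue . the sky is purple . the sky is blue . "
          "the model is confident .").split()

# "Train": count which token follows which
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def next_token(token: str) -> str:
    """Sample the next token in proportion to how often it appeared.
    Pure plausibility: nothing here checks whether the claim is correct."""
    options = follows[token]
    return random.choices(list(options), weights=list(options.values()))[0]

random.seed(0)
print("is ->", [next_token("is") for _ in range(8)])
# mostly "blue", sometimes "purple" or "confident": fluent either way,
# and correct only by accident
```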
But isn't that just what humans are really good at, more than anything else? Yeah, yeah, you'd be amazed what people will say with confidence, and what they'll believe. Oh yes, the sky is purple, you know? And people will take that at face value if you say it in the right way. And ChatGPT and the various LLMs that are out there are very, very, very good at that. Yeah, that's interesting. It's like, it's probably going to be right. Yeah, and most people are just like, ah, that's good, you know, and just move on. So what happens if we just keep doing it and we're just layering and layering and layering on-- The layers, the layers scare me. Yeah, truth is gone. And we talked about it, here we go again, last season: when you remove the citation, when you start feeding in synthetic data, you might raise a generation of complete fucking idiots. (laughing) So, I actually have a little bit more of a hopeful take. So I'm a little bit more of an optimist when it comes to that. So we are going to go through a disinformation thing, right? And we are going to have this period where you can generate false data very, very quickly, where it can make claims that don't really get checked, make images that aren't real, all sorts of stuff. But I would argue that there is no benefit for someone to take that at face value. And so what's gonna happen is that people are going to be trained by these models, unintentionally, to be critical thinkers about what's going on. And we've seen that. There's Instagram posts of all these NBA players in their postgame press conferences. Those are hilarious-- Just insane things. - Just insane things. Completely outlandish deep-fake stuff. It'll be like, yeah, my coach is trash, I'm the best player on this team. And you just immediately recognize that it's fake. And we've talked about this before, where it's like, when Facebook got popular and the older people started getting on there, now I might be one of them, but anyway, they got on there and they took it all at face value initially. And you had this problem with disinformation; you still have disinformation, but you did have that problem. But over time, people started pattern-matching-- Do you think, so we'll go back to 2016, probably the height of disinformation on social media. I mean, is it though? Have we learned? Has it gotten better? No, it has not gotten any better. Do you think it's gotten better? Well, I mean, people still do take things at face value, I will say that. He's optimistic that people are gonna become-- Do you think it's all gonna burn down? I don't. I think, though, it's probably gonna take this youngest generation who's born into it to really be like, you guys are a bunch of idiots, how did you fall for this? We'll all be sitting there being like, "Hey, nuclear war is starting tomorrow!" So, I mean, there are some waves of this that kind of go along. I remember when I was in high school, Wikipedia, no teacher would let you touch Wikipedia. Oh my gosh, now it's like a shining source. Oh yeah, no, absolutely. And so, you know, you'll have a hype cycle around whether folks trust what they see on the internet. I think there used to be some rule that if it ended with a .org, you could trust it more or something like that. Yeah, yeah. I had a ceramics teacher that said, "If it shows up on the internet three times, it can be treated as fact." So I went and put some fact about him on there in three different places, and I was like, "Look at this." (laughing) Yeah, and, you know, it might be too late for the millennials, but I do think, folks, we wouldn't have gotten to where we're at if people didn't have critical thinking skills. And I mean that from the very, very bottom to the very, very top. Folks have a capability, and this is probably the most human thing about us if we're thinking about advantages that humans have over AI: to ask questions, interrogate data, and, you know, come to a reasonable conclusion. Sometimes they're wrong. Sometimes they think the sun orbits Earth and whatnot. Right. But this has been the tale of the last 500 years: applying critical thinking and moving forward because of that. Then just help me understand why we have so many damn flat-earthers. Oh, I think that's a meme more than anything. Yeah, but I don't-- I don't know, I follow these people. They drive me nuts. I genuinely think the flat-earthers would flip their opinion tomorrow if they genuinely trusted the government, right?
I think it comes from mistrust. I don't think they actually believe it, personally. 'Cause like, there's just no way, right? Like, why isn't there a flat Mars or a flat moon? And it's because, they'd say, we can observe that those aren't flat, right? So that's assuming that space is real and that those things aren't just plants that are-- Yeah, no, that's true. And this is actually more reflective of something that's outside of AI in particular. Everyone has the tools, without binoculars or anything: if you just have a ruler or something, you can go verify that the world is indeed round. And you can actually get a pretty good estimate just by looking at shadows and that sort of thing. What we don't do is enable everyone to ask those questions and say, okay, if I have this hypothesis that the world is round, how would I go about proving it? It gets back to the accuracy thing too. It's like, how do you know what's good or not? It's the same kind of concept, which is called the scientific method. And that is the only way to verify this information.
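The shadow experiment mentioned here really is just arithmetic: Eratosthenes' method. Two sticks measured at the same moment, a known distance apart on a north-south line; the difference in sun angle tells you what fraction of a full circle separates them. The stick and shadow readings below are made-up sample numbers chosen near the historical ones.

```python
# Estimating Earth's circumference from two shadow measurements.
import math

def circumference_km(angle_a_deg: float, angle_b_deg: float, distance_km: float) -> float:
    """The angle difference is the slice of the 360-degree circle
    spanned by the distance between the two sticks."""
    angle_diff = abs(angle_a_deg - angle_b_deg)
    return 360 / angle_diff * distance_km

# Sample readings: shadow angle = atan(shadow length / stick height)
angle_a = math.degrees(math.atan(0.0 / 1.0))    # no shadow: sun directly overhead
angle_b = math.degrees(math.atan(0.126 / 1.0))  # ~7.2 degree shadow
print(round(circumference_km(angle_a, angle_b, 800)), "km")  # ~40,000 km, close to reality
```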
And I think that kind of leads into this concept of living inside of this system, right? And there's people that live inside the system that just can't accept the system, or the controls of the system. And they'll do anything; they'll say the Earth is flat, the government's run by the Freemasons or something. For the, you know, the devil organ. See, that's the one that I haven't heard. (laughing) The real one, it's the Illuminati. Oh, okay. But I think that that's a concept, you know, and you look at LLMs, and we talked about the solar system and the three-body problem, right? And you're like, what is to say that introducing a slight glitch in the matrix, or a disturbance in the force, can't bring it all down? Oh, absolutely. I mean, this is the nature of math and that sort of thing. If you go through the science and all that, the answer is: well, it hasn't happened yet. (laughing) So I think we're okay for right now. But I do think that gets to something pretty fundamental: I see this a lot in some of the products that we do, that people are uncomfortable with uncertainty. And they're not able to really ascertain, when I am giving language about confidence or uncertainty, what does that actually mean, right? And what are the implications of that? And kind of like, what are the betting odds that-- Well, is it true? Yeah, maybe the future is like C-3PO quoting you the odds of surviving-- I never thought about that. But you know what I mean? Maybe that's a better way of doing it, and maybe that's how we figure this out: the LLM, or the robot that's doing your dishes or whatever, comes back with some sort of confidence metric that goes along with the stuff it's feeding back to us, 'cause right now it's just getting fed back to us flat. Yeah, and we learn over time what that means, how to reason about it. Well, and the biggest thing, so they are doing elements of this with human-in-the-loop, like thumbs up, thumbs down, and asking stuff. I get into arguments with ChatGPT about math quite a lot. And, you know, it's like, which answers are better, what's going on, and that sort of thing. The problem, though, is then you're relying on the knowledge of the people in order to make that assessment. And one of the key critical parts of the scientific method is reproducibility, which becomes really, really difficult when you're trying to figure out what is fact versus fiction, and then you end up overvaluing certain sources of data versus others, which gets back to what we were talking about before. And you think about the implications of that, especially if you have beliefs that maybe don't align with what seems to be indicated by the data. That's where a lot of trouble gets started. Right, now that makes a lot of sense. Well, I think we can call it. Call it! Season two, episode one. Episode one in the bag. Thank you, Zach, for coming and illuminating us with all this great information. How can they learn more about you and your company? All right, so as we said at the top, I'm at data317.com. We build data science products, that sort of thing. Well, I already said the web address, yeah. So data317.com, that's where you can find us. You can also go to LinkedIn. Yeah, we like to nerd out about math. And so if you have a math problem, or a business problem that just isn't behaving itself or something, give us a call. Awesome, awesome. Yeah, thank you so much for coming. All right guys, we'll see you next week.