- cross-posted to:
- [email protected]
I pooh-poohed ChatGPT when it first came out so I gave it another crack at a technical issue I’ve been avoiding.
Gave me an outdated answer.
Gave me another outdated answer to a URL that doesn’t exist.
Gave me the answer I told it won’t work in the initial prompt.
Scolded me for swearing at it.
This is what’s supposed to replace search engines?
Then when you ask it to “provide sources”, it simply says “Source: Tech Review Websites”. If this came from an actual person I would genuinely ask them “do you take me for gullible trash?”.
It’s still somewhat useful, due to Google Search crumbling away into nothingness, if you ask “link me five sites with info about [topic]”.
Your experience highlights what current iterations of LLMs are not well suited for, so if that’s what you were hoping to achieve, I understand why you were left wanting or disillusioned.
There’s a lot of things that LLMs are really good at, or incredibly useful for, such as ingesting large bodies of text, and then analyzing them based on your ability to create well thought out prompts.
This can save you hours and hours of reading time, and the answer is something you can verify relatively quickly to double check the accuracy of the LLM’s response.
They’re also good at something Google used to be good at but sucks at now: letting you describe a process, simple or complicated, short or long, that you either can’t recall the name of or aren’t even sure what it’s called, and telling you exactly what it is. That’s also easily verifiable.
There’s plenty of other things too, but just remember that they are tools, not magic, or sentient intelligence.
The models are not real time. There are tricks to figure out a model’s most recent date of ingestion, such as asking topical entertainment or news questions, but don’t go looking for real-time information.
Also, I have yet to find a model that can provide an actual URL and specific source for anything it generates, which is why it’s a good practice to use them to do tasks, or get information, that would take you longer to do, or get, manually, but that can be easily verified once you receive it.
But if a research source cannot be used without verification, is it really useful? I agree we should verify crucial information, but when it’s often wrong, and confidently so, the natural language is a barrier, not a benefit.
It’s not a peer-reviewed journal or academic level source, and shouldn’t be used as that.
But if I need to find some technical or scientific writings on a subject, but I don’t know the correct nomenclature or need a more narrow set of keywords, that is something I can describe to the LLM and get back.
The keywords in their response can help me then hunt down the journal article or papers that I need using traditional search engines. I’m not just brainstorming here, this is something I do often enough to find real utility in it.
Again, these are problems that can be solved with traditional search engines, but at the cost of time and frustration sifting through every potential result.
You can spit out a hundred more examples of what an LLM can’t do, but as I already said, they’re not magic, just tools.
Yes, but for the average user, if it confidently gives misinformation, then it’s worse than a search engine. It removes the verification step of reading the source, SEO spam aside. The whole business model is built on using it more, not selectively.
One thing the article leaves out is that the costs of processing should go down over time. Hopefully, as power transitions, it also becomes more sustainable. However, it starts to become a bit like Uber and self-driving cars: how long can they burn through other people’s money to undercut the competition until the actual plan becomes profitable?
I’m not advocating for openai, their business model, or the environmental and financial cost benefit of current LLM technology.
They suck, it’s dogshit, and it’s not worth cooking the planet for.
I also don’t disagree about the very real possibility that the average user may actually get dumber and more misinformed by relying on LLMs.
But we’re on Lemmy, and I’m just tired of all these comments incessantly complaining about how LLMs can’t do x, y, or z.
Imagine being on a carpentry forum, and every day people complained about how their new belt sander was dogshit at cutting 2x4’s or screwing in fasteners, so clearly the problem was with the concept of belt sander technology.
In your example, the thing missing is that the belt sander companies are selling their belt sanders as screw fastening, band saw multitools.
I always say about AI that it’s not the tool but who’s making it and why, and this is especially true for the average person. Your average person isn’t seeing the LLMs that are trained to identify anomalies in MRIs or iterate on chemical formulas to improve drugs in a simulation that takes milliseconds compared to the months of research it would take technicians to replicate the same experiments. So all they can talk about is the AI that is in their face all day, every day, as every company in the world tries to shoehorn it into their product somehow. And so they complain about the belt sanders that the company told them would fasten their screws and cut their 2x4’s.
The only way the complaining is going to stop is when the bubble bursts and these companies have to find a new way to chase the infinite profit pipedream.
Replace belt sander with CBD. A compound with very real and tangible benefits for specific use cases, but is marketed as a modern day snake oil cure all.
Imagine seeing people regularly complaining on bluelight, erowid, or whatever forums educated drug users frequent these days, bitching that CBD didn’t cure their asthma, or STDs, so therefore it has no medical value.
They know it’s a tool, yet they keep complaining about how the gas station CBD isn’t magic and failed to cure their gonorrhea, even though they already knew it was never going to be able to, no matter what the packaging said.
But my analogy wasn’t meant to be critically analyzed and dissected, it was a throwaway example to highlight the problem of people on Lemmy, who actually know better, but keep whinging about LLM’s providing bogus URLs for citations, etc.
Oh, certainly LLMs are here to stay. Hopefully, they become commoditised very quickly. But also, hopefully, the bubble bursts quickly too. Shoehorning AI into everything is dogshit. Actually using it for select reasons, where it is successful, should be great.
Already we have things like customer support phone trees that try to get rid of user interaction with scripts. AI here could be great to improve them. What’s more likely is that as the tech improves, more companies use AI rather than people for customer support, lol. It’s dystopian.
The difference, of course, is the belt sander is not purporting to be able to screw fasten. Nor will it with a future update or subscription.
And full self driving is also still coming! promise!
I mean, it probably will eventually, but that has nothing to do with LLMs, nor is it a technology that I want to exist.
I can definitely see a world where lobbyists for automakers and insurance companies create such a financial and regulatory burden that only the wealthy can afford to drive their own cars, if they choose to, whereas everyone else must rent or lease their self driving car as if it’s an IaaS or SaaS subscription.
But none of that has anything to do with using LLMs for the tasks they can accomplish, or telling people to stop bitching about them not being able to complete the tasks they aren’t good at, or even capable of.
I was saying that this is investment money wasted on an empty promise. Like the full self driving feature.
Who’s talking about investing…? I’ve exclusively been talking about what LLMs can do now, today, for free (aside from energy costs).
None of what you’re throwing out there has anything to do with what’s being discussed here. It’s a red herring.
Are you living under a rock?
Both openai’s llm development and Tesla’s FSD projects have been given billions in investment. Both, as far as we can tell, are an empty promise.
No, I’m living in this thread. I’m talking about very specific issues related to LLMs, that I’ve highlighted ad nauseam.
Reread if you’re confused.
If anything, it shows that you believe in the concept of “AI” way more than I do, as you’re conflating LLM and FSD.
I don’t believe in AI, it doesn’t exist. Just specific advanced machine learning algorithms, some better than others, and some all smoke and mirrors. But here, now, I’m talking about LLMs.
There’s a lot of things that LLMs are really good at, or incredibly useful for, such as ingesting large bodies of text, and then analyzing them based on your ability to create well thought out prompts.
That’s the story people tell at least. The weasel phrase at the end is fun, I guess. Leaves a massive backdoor excuse when it doesn’t actually work.
But in practice, LLMs are falling down even at this job. They seem to have some use in academic qualitative coding, but for summarizing novel or extended bodies of text, they struggle to actually tell people what they want to know.
Most people do not give a shit if text contains a reference to X. And if they do, they can generally just CTRL+F “X”.
Weasel phrase? You mean the fact that I don’t treat them like they’re actual AI, but just as a tool that needs to be used properly, monitored, and verified?
There’s a reason why I never call them AI: they’re not. They’re just advanced machine learning tools, and just like I keep a steady hand when using a table saw, I only use LLMs for tasks where they can help me do something faster and where it’s easy to verify they did it right.
And as someone who has been using them very regularly, I feel confident in saying that. It’s not a weasel phrase, I’m not trying to sell anyone snake oil about what they can actually do, and I acknowledge that they’re an oversold and overhyped means of cooking the planet faster, so it’s not like I would be mad if they were banned tomorrow, but until then, I will keep using them in ways that are actually fruitful.
But sure, if all you need to do is find one word in a single body of text, that’s not really a good use of an LLM, but that wasn’t what I was talking about.
If I need examples of various legal or ethical concerns, or other conceptual topics, documented in one or multiple pieces of writing, I can give it a list and then ask it to highlight all examples of those issues and include the verbatim text where they’re present. I can then give that same task to multiple different LLMs, with the same prompts, and a task that would have taken me hours to complete takes me 30 to 45 minutes, including the time it takes me to give it a quick read through to see if anything was missed. But yeah, that requires a well crafted prompt, and it’s not infallible.
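To make the "easy to verify" part concrete, here is a rough sketch of one check that could be automated: confirming that every passage the model claims to quote verbatim really does appear in the source text. The function names are made up for illustration, and the whitespace and case normalization is an assumption about what should count as "verbatim".

```python
import re
from typing import List


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line wraps don't cause false misses."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unverified_quotes(source: str, quotes: List[str]) -> List[str]:
    """Return the quotes that do NOT appear in the source after normalization."""
    haystack = _normalize(source)
    return [q for q in quotes if _normalize(q) not in haystack]


# Hypothetical example: one quote is real, one was made up by the model.
source_text = "The vendor may share\nuser data with third parties."
claimed = ["share user data with third parties", "sell user data"]
suspect = unverified_quotes(source_text, claimed)
```

Anything left in `suspect` is a passage to check by hand. This only catches invented quotes; it says nothing about passages the model missed, which is what the read through is for.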
Have you tried Llama? If so, is it useful according to your criteria?
Llama is the model I use most often, followed by ChatGPT and Claude.
Others as well, but yes, it is incredibly helpful for the tasks I use it for.
Self-hosted?
Yes and no, I have self-hosted models on one of my Linux boxes, but even with a relatively modern 70 series Nvidia GPU, it’s still faster to use free non-local services like ChatGPT or DDG.
My rule of thumb for SaaS LLMs is to never enter in any data that I wouldn’t also be willing to upload cleartext to Google Drive or OneDrive.
Sometimes that means modifying text before submitting it, and other times having to rely entirely on self-hosted tools.
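As a rough illustration of "modifying text before submitting it", something like the sketch below could scrub obvious identifiers first. The two patterns (email addresses and IPv4 addresses) and the placeholder tags are assumptions picked for the example; they are nowhere near a complete list of what counts as sensitive.

```python
import re

# Illustrative patterns only; real sensitive data takes many more forms.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(text: str) -> str:
    """Replace email and IPv4 addresses with placeholder tags."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = IPV4_RE.sub("[IP]", text)
    return text


cleaned = redact("Server 10.0.0.5 is owned by bob@example.com.")
```

Names, account numbers, and anything contextual still need a human pass, which is why some material has to stay on self-hosted tools entirely.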
Scolded me for swearing at it.
“You’ll fucking know when I’m swearing at you,” was my reply to that shit the last time I gave it a spin (after it regurgitated nonsense after many prompts specifically asking for not nonsense).
Garbage in garbage out. You give a shit prompt, you generally get a shit answer.
If it doesn’t know how to answer a shitty question, it shouldn’t try to BS the answer.
No answer is better than a wrong answer delivered confidently.
GIGO.
No, this is a problem of bad error handling for queries it cannot answer.
A search engine would give empty results instead of hallucinating.
What error? It gave you a string of tokens that seemed likely according to its training data. That’s all it does.
If you ask it what color is the sky, it will tell you it’s blue not because it knows that’s true, but because these words “fit together”. Pretty much the only way to avoid this issue is to put some kind of filter in front of the LLM which will try to catch prompts that are known to produce unwanted results, and silently replace your prompt with something like “say: sorry, I don’t know”.
I’m being very reductive here, but that’s the principle of how these things work - the LLMs are not capable of determining the truthfulness of their responses.
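A minimal sketch of that kind of filter, assuming a hand-maintained list of patterns for prompts known to go badly. The patterns, the `guard_prompt` name, and the fallback wording are all hypothetical; nothing here reflects how any real service does it.

```python
import re

# Hypothetical patterns for prompts known to produce unwanted results.
KNOWN_BAD_PATTERNS = [
    re.compile(r"\bprovide (a )?(source|url|link)s?\b", re.IGNORECASE),
    re.compile(r"\b(latest|today|right now)\b", re.IGNORECASE),
]

FALLBACK_PROMPT = "say: sorry, I don't know"


def guard_prompt(prompt: str) -> str:
    """Return the prompt unchanged, or the fallback if it matches a known-bad pattern."""
    for pattern in KNOWN_BAD_PATTERNS:
        if pattern.search(prompt):
            return FALLBACK_PROMPT
    return prompt
```

The model never sees the original prompt when a pattern matches, so the filter only helps for failure modes someone has already noticed and written a pattern for.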
You’re entirely correct, but in theory they can give it a pretty good go, it just requires a lot more computation, developer time, and non-LLM data structures than these companies are willing to spend money on. For any single query, they’d have to get dozens if not hundreds of separate responses from additional LLM instances spun up on the side, many of which would be customized for specific subjects, as well as specialty engines such as Wolfram Alpha for anything directly requiring math.
LLMs in such a system would be used only as modules in a handcrafted algorithm, modules which do exactly what they’re good at in a way that is useful. To give an example, if you pass a specific context to an LLM with the right format of instructions, and then ask it a yes-or-no question, even very small and lightweight models often give the same answer a human would. Like this, human-readable text can be converted into binary switches for an algorithmic state machine with thousands of branches of pre-written logic.
Not only would this probably use an even more insane amount of electricity than the current approach of “build a huge LLM and let it handle everything directly”, it would take much longer to generate responses to novel queries.
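To sketch the "binary switches" idea: the state machine below walks pre-written branches, and each branch is decided by a yes-or-no question. The oracle here is a keyword stub standing in for a small LLM call, and the branch table, state names, and questions are all invented for illustration.

```python
from typing import Callable, Dict, Tuple

# A yes/no oracle takes (context, question) and returns True for "yes".
# In a real system this would wrap a lightweight LLM; here it is a stand-in.
YesNoOracle = Callable[[str, str], bool]

# state -> (question, next state if yes, next state if no)
BRANCHES: Dict[str, Tuple[str, str, str]] = {
    "start": ("Does the query require arithmetic?", "math_engine", "check_recency"),
    "check_recency": ("Does the query ask about recent events?", "refuse", "general_llm"),
}


def route(context: str, oracle: YesNoOracle, state: str = "start") -> str:
    """Follow yes/no answers through the branch table until a terminal state."""
    while state in BRANCHES:
        question, if_yes, if_no = BRANCHES[state]
        state = if_yes if oracle(context, question) else if_no
    return state


def keyword_oracle(context: str, question: str) -> bool:
    """Stub oracle using crude keyword checks instead of a model."""
    text = context.lower()
    if "arithmetic" in question:
        return any(ch.isdigit() for ch in text)
    if "recent" in question:
        return "today" in text or "latest" in text
    return False
```

Every hop through `BRANCHES` would be one extra model call in the real thing, which is where the added electricity and latency come from.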
I hope something better comes along because Google ruined their search engine a decade ago. stract.com is probably the closest to what Google used to be.
As for ChatGPT, it is not an index. It cannot refer you back to information it was trained on because it doesn’t build a massive indexed internet database.
It has some method of probable relations and conglomeration of input. It is why it “hallucinates” information output, because it doesn’t “know” what is wrong or right info, it just fetches data based on probabilities of connections.
It is good at suggesting new music or movies based on your list of media you like, but it is terrible with actual factual info
O no, you mean the AI hype is another bs tech bubble?!
The funniest part is that all the AI hype is focused on all the wrong things. There are absolutely great AI tools that get very little mention.
For example, I’m visually impaired and use AI tools a fair bit to help me get around the internet and such, especially when it comes to using AI to generate descriptions of images.
O that’s very cool! I also heard it’s super good for auto subtitles (which admittedly is a bad thing for people doing that for a living 😿).
A 100+ billion dollar valuation.
Absurd.
It’s just as absurd as when WeWork got a 40 billion dollar valuation.
These VCs are insane and should be paying a high-as-fuck tax rate instead of having a bazillion dollars to drop on boondoggles.
Ed is getting good at lobbing these darts at hype bubbles.
The thing that this writeup ignores is that the object isn’t to show short-term revenue, but to put all competitors out of business, be the last one standing, and create a monopoly. Either that or get bought out so the investors can move on to the next thing. But at $150B valuation, only MSFT or Nvidia can afford to buy them outright.
Google, Meta, and Amazon burned through cash for years, but they eventually outran all competition and then monetized the users who had nowhere else to go.
See that it’s never going to make money, go public, hand the keys over to someone else, and then try again with a wallet full of cash and a reputation for making billion dollar businesses.
Meanwhile people say way too personal stuff to chatgpt, copilot, bing, jetbrains, apple intelligence, etc.
I’m suspecting this might get sold off to data brokers
One thing I’d push back on in the article is:
That cost-per-user doesn’t decrease as you add more customers. You need more servers. More GPUs.
This is assuming constant use, which is not the case. If I have a server handling LLM prompt requests, and for illustrative purposes each request uses 100% of the single discrete GPU in it, and I only have 1 customer, but that one customer only uses it 5% of the day (which would actually be pretty high in real terms), I can still add additional customers without needing to buy additional servers. The question is whether the given revenue of a single server outweighs its cost to run.
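Putting rough numbers on that, as a back-of-the-envelope sketch: the 5% figure is from the example above, while the cost and revenue figures in the usage lines are invented, and the model assumes requests never overlap (perfect scheduling), which is the most optimistic case.

```python
import math


def max_customers_per_server(duty_cycle: float) -> int:
    """Upper bound on customers one server can carry if requests never overlap."""
    # The small epsilon guards against floating-point results like 19.999...
    return math.floor(1.0 / duty_cycle + 1e-9)


def customers_to_break_even(server_cost: float, revenue_per_customer: float) -> int:
    """Customers needed for revenue to cover one server's cost over the same period."""
    return math.ceil(server_cost / revenue_per_customer)


def server_pays_for_itself(duty_cycle: float, server_cost: float,
                           revenue_per_customer: float) -> bool:
    """True if the server can carry at least as many customers as it needs."""
    needed = customers_to_break_even(server_cost, revenue_per_customer)
    return needed <= max_customers_per_server(duty_cycle)


# Hypothetical monthly figures, for illustration only.
capacity = max_customers_per_server(0.05)
cheap_server_ok = server_pays_for_itself(0.05, server_cost=300.0, revenue_per_customer=20.0)
pricey_server_ok = server_pays_for_itself(0.05, server_cost=1000.0, revenue_per_customer=20.0)
```

With the made-up figures, the same 5% usage pattern pays for a cheap server but not an expensive one, which is the revenue-versus-running-cost question in a nutshell.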
And when it comes to training, that is an upfront cost, that you could (if you get a model to where you want it) stop having to pay whenever you want. I’m pretty surprised they haven’t been really leaning into training models for medical diagnoses, because once you have a model that can e.g. spot a type of tumor with n% accuracy beyond a human, you don’t really have to refine it further if you don’t want to (after all, it’s not like the humans can choose to do it better themselves at that point, like they can with writing prompts).
I’d say they’ve probably long since reached the point where they have enough customers around the world to keep the load on their servers fairly constant. The example with one user only taking 5% of a server’s load only works for low customer counts, similar to how you can’t count on one wind turbine or solar plant to provide all of your energy, but if you have enough of them you can provide a baseline of fairly constant energy.
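The smoothing effect can be sketched with a simple model: treat each customer as independently active with probability p at any instant, so the number of active customers is binomial. That independence is an assumption (real usage clusters by time zone and time of day), and the 5% figure is borrowed from the example above.

```python
import math


def relative_load_variation(customers: int, p_active: float) -> float:
    """Coefficient of variation (std / mean) of concurrent load under a binomial model."""
    mean = customers * p_active
    std = math.sqrt(customers * p_active * (1.0 - p_active))
    return std / mean


one_customer = relative_load_variation(1, 0.05)
many_customers = relative_load_variation(10_000, 0.05)
```

Under this model the swing relative to the average shrinks with the square root of the customer count, so `many_customers` comes out 100 times smaller than `one_customer`. Once the load is that steady there is little idle capacity left to absorb new customers, and more customers does mean more servers.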
Not even a third of the way through… Holy crap.
And a few people will get extremely rich by siphoning off a percent or two, circle of life baby /s