Can I just say that Google AI Studio with latest Gemini is stunningly, amazingly, game changingly impressive.
It leaves Claude and ChatGPT's coding looking like they are from a different century. It's hard to believe these changes are coming in factors of weeks and months. Last month I could not believe how good Claude is. Today I'm not sure how I could continue programming without Google Gemini in my toolkit.
Gemini AI Studio is such a giant leap ahead in programming I have to pinch myself when I'm using it.
Eezee 11 minutes ago [-]
I tried it out because of your comment and on the very first prompt Gemini 2.5 Pro hallucinated a non-existent plugin, including detailed usage instructions.
Not really my idea of good.
CuriouslyC 12 hours ago [-]
I'm really surprised more people haven't caught on. Claude can one shot small stuff of similar complexity, but as soon as you start to really push the model into longer, more involved use cases Gemini pulls way ahead. The context handling is so impressive, in addition to using it for coding agents, I use Gemini as a beta reader for a fairly long manuscript (~85k words) and it absolutely nails it, providing a high level report that's comparable to what a solid human beta reader would provide in seconds.
snthpy 2 hours ago [-]
I also used it to "vibe write" a short story. I use it similarly to vibe coding, I give the theme and structure of the story along with the major sections and tensions and conflicts I want to express and then it filled in the words in my chosen style. I also created an editor persona and then we went back and forth between the editor and writer personas to refine the story.
My writing process is a bit different from my coding process with AI, it's more of an iterative refinement process.
I tend to form the story arc in my head, and outline the major events in a timeline, and create very short summaries of important scenes, then use AI to turn those summaries into rough narrative outlines by asking me questions and then using my answers to fill in the details.
Next I'll feed that abbreviated manuscript into AI and brainstorm as to what's missing/where the flow could use improvement/etc with no consideration for prose quality, and start filling in gaps with new scenes until I feel like I have a compelling rough outline.
Then I just plow from beginning to end rewriting each chapter, first with AI to do a "beta" draft, then I rewrite significant chunks by hand to make things really sharp.
After this is done I'll feed the manuscript back into AI and get it to beta read given my target audience profile and ambitions for the book, and ask it to provide me feedback on how I can improve the book. Then I start editing based on this, occasionally adding/deleting scenes or overhauling ones that don't quite work based on a combination of my and AI's estimation. When Gemini starts telling me it can't think of much to improve the manuscript that's when it's time for human beta readers.
snthpy 40 minutes ago [-]
Thank you for sharing that. I'm going to try that up to "then I rewrite significant chunks by hand to make things really sharp". I'm not a writer and would have never dreamed of writing anything until I gave this a try. I've often had ideas for stories though and using Gemini to bring these to "paper" has felt like a superpower similar to how it must feel for people who can't code but are now able to create apps thanks to AI. I think it's a really exciting time!
I've been wondering about what the legalities of the generated content are though since we know that a lot of the artistic source content was used without consent? Can I put the stories on my blog? Or, not that I wanted to, publish them? I guess people use AI generated code everywhere so I guess for practical purposes the cat is out of the bag and won't be put back in again.
CuriouslyC 28 minutes ago [-]
If you've put manual work into curating and assembling AI output, you have copyright. It's only not copyrightable if you had the AI one shot something.
wewewedxfgdf 12 hours ago [-]
It is absolutely the greatest golden age in programming ever - all these infinitely wealthy companies spending bajillions competing on who can make the best programming companion.
Apart from the apologising. It's silly when the AI apologises with ever more sincere apologies. There should be no apologies from AIs.
yujzgzc 9 hours ago [-]
You're absolutely right! My mistake. I'll be careful about apologizing too much in the future.
DonHopkins 4 hours ago [-]
You sound like a Canadian LLM!
conartist6 40 minutes ago [-]
Wow yeah I'm old enough to remember when the focus wasn't on the programmers, but on the people the programs were written for.
We used to serve others, but now people are so excited about serving themselves first that there's almost no talk of service to others at all anymore
thingsilearned 12 hours ago [-]
companion or replacement?
tonyhart7 8 hours ago [-]
they would replace entire software department until AI make bug because endless changes into your javascript framework then they would hire human again to make fix
we literally creating solution for our own problem
Terr_ 11 hours ago [-]
... Or saboteur. :p
surgical_fire 7 hours ago [-]
[dead]
paganel 2 hours ago [-]
> It is absolutely the greatest golden age in programming ever
It depends, because you now have to pay in order to be able to compete against other programmers who're also using AI tools, it wasn't like that in what I'd call the true "golden age", basically the '90s - early part of the 2000s, when the internet was already a thing and one could put together something very cool with just a "basic" text editor.
koakuma-chan 10 hours ago [-]
And Gemini is free.
harvey9 6 hours ago [-]
The first hit is always free
conartist6 38 minutes ago [-]
The investors know it. They're not competing to own this shit like it's gonna stay free.
hfgjbcgjbvg 2 hours ago [-]
Real.
scuol 8 hours ago [-]
Well, as with many of Google's services, you pay with your data.
Pay-as-you-go with Gemini does not snort your data for their own purposes (allegedly...).
maksimur 7 hours ago [-]
Undoubtedly, but a significant positive aspect is the democratization of this technology that enables access for people who could not afford it, not productively, that is.
nativeit 5 hours ago [-]
We’re all paying for this. In this case, the costs are only abstract, rather than the competing subscription options that are indeed quite tangible _and_ abstract.
in_ab 7 hours ago [-]
I asked it to make some changes to the code it wrote. But it kept pumping out the same code with more and more comments to justify itself. After the third attempt I realized I could have done it myself in less time.
reacharavindh 2 hours ago [-]
I use Gemini 2.5 Pro through work and it is excellent. However, I use Claude 3.7 Sonnet via API for personal use, using money added to their account.
I couldn’t find a way to use Gemini like a prepaid plan. I ain’t giving my credit card to Google for an LLM that can easily charge me hundreds or thousands of EUR.
conartist6 42 minutes ago [-]
You don't worry that you can't think anymore without paying google to think for you?
conartist6 10 minutes ago [-]
OK, a better scenario than that: for some reason they cut you off. They're a huge company, they don't really care, and you would have no recourse. Many people live this story. Where once you were a programmer, if Google convinces you to eliminate your self-reliance they can then remotely turn off you being a programmer. There are other people who will use those GPU cycles to be programmers! Google will still make money.
DHolzer 2 hours ago [-]
without wanting to sound overly sceptical, what exactly makes you think it performs so much better compared to claude and chatgpt?
Is there any concrete example that makes it really obvious?
I had no such success with it so far and i would really like to see the clear cut between the gemini and the others.
lifty 5 hours ago [-]
Excuse my ignorance, but is the good experience somehow influenced by Google AI Studio as well or only by the capability of the model itself? I know Gemini 2.5 is good, have been using it myself for a while. I still switch between Sonnet and Gemini, because I feel Claude code does some things better.
alecco 5 hours ago [-]
Remember when Microsoft started to do good things? Big corps suck when they are on top and unchallenged. It's imperative to reduce their monopolies.
Gud 2 hours ago [-]
No, I don’t.
ionwake 2 hours ago [-]
lmao
noosphr 11 hours ago [-]
It always is for the first week. Then you find out that the last 10% matter a lot more than the other 90%. And finally they turn off the high compute version and you're left with a brain dead model that loses to a 32b local model half the time.
alostpuppy 9 hours ago [-]
How do you use it exactly? Does it integrate with any IDEs?
Mossy9 8 hours ago [-]
Jetbrains AI recently added (beta) access to Gemini Pro 2.5 and there's of course plugins like Continue.dev that provide access to pretty much anything with an API
I have not used any IDEs like Cursor or Zed, so I am not sure what I should be using (on Linux). I typically just get on Claude (claude.ai) or ChatGPT and do everything manually. It has worked fine for me so far, but if there is a way to reduce friction, I am willing to give it a try. I do not really need anything advanced, however. I just want to feed it the whole codebase (at times), some documentation, and then provide prompts. I mostly care about support for Claude and perhaps Gemini (would like to try it out).
bossyTeacher 1 hours ago [-]
> Today I'm not sure how I could continue programming without Google Gemini in my toolkit
Anyone else concerned about these kinds of statements? Make no mistake, everyone. We are living in an LLM bubble (not an AI bubble, as none of these companies are actually interested in AI as such, as in moving towards AGI). They are all trying to commercialise LLMs with some minor tweaks. I don't expect LLMs to make the kind of progress made by the first 3 iterations of GPT. And when the insanely hyped overvaluations crash, the bubble WILL burst. You BETTER hope there is any money left to run these kinds of tools at a profit or you will be back at Stackoverflow trying to relearn all the skills you lost using generative coding tools.
petesergeant 9 hours ago [-]
Is this distinct from using Gemini 2.5 Pro? If not, this doesn’t match my experience — I’ve been getting a lot of poorly designed TypeScript with an excess of very low quality comments.
christophilus 10 minutes ago [-]
The comments drive me nuts.
// Moved to foo.ts
Ok, great. That’s what git is for.
// Loop over the users array
Ya. I can read code at a CS101 level, thanks.
yahoozoo 2 hours ago [-]
Nice try, Mr. Google.
But seriously, yeah, Gemini is pretty great.
landl0rd 8 hours ago [-]
Really? I get goofy random substitutions like sometimes from foreign languages. It also doesn't do good with my mini-tests of "can you write modern Svelte without inserting React" and "can you fix a borrow-checking issue in Rust with lifetimes, not Arc/Cell slop"
That doesn't mean it's worse than the others just not much better. I haven't found anything that worked better than o1-preview so far. How are you using it?
insin 13 hours ago [-]
Is it just me or did they turn off reasoning mode in free Gemini Pro this week?
It's pretty useful as long as you hold it back from writing code too early, or too generally, or sometimes at all. It's a chronic over-writer of code, too. Ignoring most of what it attempts to write and using it to explore the design space without ever getting bogged down in code and other implementation details is great though.
I've been doing something that's new to me but is going to be all over the training data (subscription service using stripe) and have often been able to pivot the planned design of different aspects before writing a single line of code because I can get all the data it already has regurgitated in the context of my particular tech stack and use case.
energy123 6 hours ago [-]
They rolled out a new model a week ago which has a "bug" where in long chats it forgets to emit the tokens required for the UI to detect that it's reasoning. You can remind it that it needs to emit these tokens, which helps, or accept that it will sometimes fail to do it. I don't notice a deterioration in performance because it is still reasoning (you can tell by the nature of the output), it's just that those tokens aren't in <think> tags or whatever's required by the UI to display it as such.
CuriouslyC 12 hours ago [-]
I think reasoning in the studio is gated by load, and at the same time I wasn't seeing so much reasoning in AIstudio, I was getting vertex service overloaded calls pretty frequently on my agents.
Der_Einzige 6 hours ago [-]
Shhh!!! Normies will catch on and google will stop making it free.
But more seriously, they need to uncap temperature and allow more samplers if they want to really flex on their competition.
wetpaws 7 hours ago [-]
[dead]
ifellover 7 hours ago [-]
Absolutely agree. I really pushed it last week with a screenshot of a very abstract visualisation that we’d done in a Miro board of which we couldn’t find a library that did exactly what we wanted, so we turned to Gemini.
Essentially we were hoping to tie that to data inputs and have a system to regularly output the visualisation but with dynamic values. I bet my colleague it would one shot it: it did.
What I’ve also found is that even a sloppy prompt still somehow is reading my mind on what to do, even though I’ve expressed myself poorly.
Inversely, I’ve really found myself rejecting suggestions from ChatGPT, even o4-mini-high. It’s just doing so much random crap I didn’t ask and the code is… let’s say not as “Gemini” as I’d prefer.
levocardia 10 hours ago [-]
In one of Stephen Boyd's lectures on convex optimization, he has some quip like "if your optimization problem is computationally intractable, you could try really hard to improve the algorithm, or you could just go on vacation for a few weeks and by the time you get back, computers will be fast enough to solve it."
I feel like that's actually true now with LLMs -- if some query I write doesn't get one-shotted, I don't bother with a galaxy-brain prompt; I just shelve it 'til next month and the next big OpenAI/Anthropic/Google model will usually crush it.
user3939382 26 minutes ago [-]
Try getting it to write a codepen sim of 3 rectangles parallel parking.
mykowebhn 1 hours ago [-]
I understand from a technical POV how this could be considered great news.
But I don't see how this is good news at all from a societal POV.
The last 15 or so years has seen an unprecedented rise in salaries for engineers, especially software engineers. This has brought an interest in the profession from people who would normally not have considered SW as a profession. I think this is both good and bad. It has brought new found wealth to more people, but it may have also diluted the quality of the talent pool. That said, I think it was mostly good.
Now with this game-changing efficiency from these AI tools, I'm sure we've seen an end to the glory days in terms of salaries for the SW profession.
With this gone, where else could relatively normal people achieve financial independence? Definitely not in the service industry.
Very sad.
thicTurtlLverXX 49 minutes ago [-]
I understand how, from a technical POV, electricity and electrification could be considered great news.
But I don't see how this is good news at all from a societal POV.
Think about all the lamplighters who lost their jobs. Streetlights just turn on now? Lamplighting used to be considered a stable job! And what about the ice cutters…
For real tho, it's not like there's nothing left to do — we still have potholes to fix, t-shirts to fold and illnesses to cure.
Just the fact that many people continue to believe that wars are justified by resource scarcity shows we need technological progress.
mykowebhn 45 minutes ago [-]
From what I understand, prior to the 1980s/90s lamplighters, waiters, factory workers, etc. could live comfortable lives on decent wages.
These days not so much.
christophilus 5 minutes ago [-]
From what I understand, life was Dickensian hell for many people. Communism wouldn’t have had much of a chance if everyone was pretty much able to live a decent life as a lamp lighter.
user3939382 27 minutes ago [-]
I can’t reconcile statements like this with my experience trying to code with LLMs. As soon as there’s any real complexity they spit out nonsense broken code that in some cases could take a long time to debug. Then when you correct it “You’re totally right, I’ll change it so that x y z”. If you weren’t a senior dev with loads of experience you wouldn’t be able to debug or correct the code these tools produce.
mykowebhn 19 minutes ago [-]
If you were a new dev now learning the ropes, with these AI coding tools available, I highly doubt you would gain the same "loads of experience".
Learning comes through struggle and it's too easy to bypass that struggle now. It's so much easier to get the answers from AI.
zkry 59 minutes ago [-]
I'm curious why there's this sentiment regarding advances in AI. High level programming languages didn't in the least bit take away the value of the SW profession, despite allowing a vast number more people to write software.
The amount and complexity of software will expand to its very outer bounds for which specialists will be required.
rocqua 1 hours ago [-]
Sounds like there need to be measures to fix income inequality.
foldr 1 hours ago [-]
Software engineers earning enough to achieve financial independence are generally employed by FAANG or (indirectly) by venture capitalists who have more money than they know what to do with.
With all this money sloshing around, it takes only a little imagination to think of ways of channeling some of it to working people without employing them to write pointless (or in some cases actively harmful) software.
hakanito 4 hours ago [-]
The game changer for me will be when AI stops hallucinating SDK methods. I often find myself asking ”show me how to do advanced concept X in somewhat niche Y sdk”, and while it produces confident answers, 90% of the time it is suggesting SDK methods that do not exist, so a lot of time is wasted just arguing about that
M4v3R 4 hours ago [-]
The current method of solving this is providing the AI with the documentation of the SDKs your code uses. Current LLMs have quite big context windows so you can feed them a lot of documentation. Some tools can even crawl multipage documentation and index them for the use of LLMs.
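A minimal sketch of what "providing the documentation" can look like in practice, assuming you already have the SDK docs as text. The names `DocPage` and `build_prompt` and the character budget are hypothetical, made up for illustration, and not part of any particular tool:

```python
from dataclasses import dataclass


@dataclass
class DocPage:
    title: str
    text: str


def build_prompt(question: str, pages: list[DocPage], max_chars: int = 20_000) -> str:
    """Put documentation pages ahead of the question, within a size budget."""
    parts = []
    used = 0
    for page in pages:
        block = f"## {page.title}\n{page.text}\n"
        if used + len(block) > max_chars:
            break  # stop before exceeding the (hypothetical) context budget
        parts.append(block)
        used += len(block)
    docs = "\n".join(parts)
    return (
        "Use only the SDK methods documented below. "
        "If a method is not documented, say so instead of guessing.\n\n"
        f"{docs}\n"
        f"Question: {question}\n"
    )
```

The returned string is what you would paste or send to the model. Pages are taken in order, so put the most relevant ones first.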
hakanito 4 hours ago [-]
How do you do that practically/reliably? Would be great to just paste a link to the SDK Github repo, but doesn't seem to work (yet) in my experience
jstummbillig 4 hours ago [-]
Simple way would be to use either Sonnet 3.7/5 or Gemini 2.5 pro in windsurf/cursor/aider and tell it to search the web, when you know an SDK is problematic (usually because it's new and not in the training set).
That's all it takes to get reliably excellent results. It's not perfect, but, at this point, 90% hallucinations on normal SDK usage strongly suggests poor usage of what is the current state of the art.
stuaxo 4 hours ago [-]
Being LLMs, they always hallucinate an API it would be great to have.
If you had something on the other side to hallucinate the API itself you could have a program that dreams itself into existence as you use it.
flir 3 hours ago [-]
"I think callTheExactMethodINeed() was a hallucination. Can you try again?"
Then it apologizes and gives the right answer. It's weird. We really need a new word for what they're doing, 'cos it ain't thinking.
fourfun 7 hours ago [-]
Google may be getting AI to write good SQL, but they aren’t getting it to write good blog posts.
mritchie712 15 hours ago [-]
the short answer: use a semantic layer.
It's the cleanest way to give the right context and the best place to pull a human in the loop.
A human can validate and create all important metrics (e.g. what does "monthly active users" really mean) then an LLM can use that metric definition whenever asked for MAU.
With a semantic layer, you get the added benefit of writing queries in JSON instead of raw SQL. LLM's are much more consistent at writing a small JSON vs. hundreds of lines of SQL.
We[0] use cube[1] for this. It's the best open source semantic layer, but there's a couple closed source options too.
My last company wrote a post on this in 2021[2]. Looks like the acquirer stopped paying for the blog hosting, but the HN post is still up.
  SELECT
    users.state,
    users.city,
    orders.status,
    sum(orders.count)
  FROM orders
  CROSS JOIN users
  WHERE
    users.state != 'us-wa'
    AND orders.created_at BETWEEN '2020-01-01' AND '2021-01-01'
  GROUP BY 1, 2, 3
  LIMIT 10;
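To make the "JSON instead of raw SQL" idea concrete, here is a rough sketch of a toy semantic layer. This is not Cube's actual query format or API. The metric names, the `compile_query` function and the join are all hypothetical. The point is only that a human defines each metric once and the LLM emits a small JSON object that gets checked and compiled:

```python
import json

# Hypothetical, human-reviewed definitions: this dict is the "semantic layer".
MEASURES = {
    "orders.count": "COUNT(*)",
    "orders.revenue": "SUM(orders.amount)",
}
DIMENSIONS = {
    "orders.status": "orders.status",
    "users.city": "users.city",
}
FROM_CLAUSE = "orders JOIN users ON users.id = orders.user_id"


def compile_query(query_json: str) -> str:
    """Compile a small JSON query into SQL, rejecting unknown field names."""
    query = json.loads(query_json)
    dims = query.get("dimensions", [])
    measures = query.get("measures", [])
    unknown = [d for d in dims if d not in DIMENSIONS]
    unknown += [m for m in measures if m not in MEASURES]
    if unknown:
        # A made-up metric fails here instead of producing plausible SQL.
        raise ValueError(f"unknown fields: {unknown}")
    select = [DIMENSIONS[d] for d in dims] + [MEASURES[m] for m in measures]
    if not select:
        raise ValueError("query selects nothing")
    sql = f"SELECT {', '.join(select)} FROM {FROM_CLAUSE}"
    if dims:
        sql += " GROUP BY " + ", ".join(DIMENSIONS[d] for d in dims)
    return sql
```

Calling `compile_query('{"measures": ["orders.count"], "dimensions": ["users.city"]}')` yields a grouped count per city, while a measure that was never defined raises an error rather than reaching the database.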
fhkatari 14 hours ago [-]
You move all the tools to debug and inspect slow queries, in a completely unsupported JSON environment, with prompts not to make up column names. And this is progress?
mritchie712 11 hours ago [-]
The JSON compiles to SQL. Have you used a semantic layer? You might have a different opinion if you tried one.
e3bc54b2 7 hours ago [-]
As someone who actually wrote a JSON to (limited) SQL transpiler at $DAYJOB, as much fun as I had designing and implementing that thing and for as many problems it solved immediately, 'tail wagging the dog' is the perfect description.
indymike 12 hours ago [-]
This may be the best comment on Hacker News ever.
IncreasePosts 10 hours ago [-]
You're right, it's a bit ridiculous. This is a perfect time to use xml instead of json.
meindnoch 2 hours ago [-]
Clearly the right solution is to use XML Object Notation, aka XON™!
We had an IT guy who once bought an XML<->JSON server for $12,000. Very proud of his rack of "data appliances". It made XML like XON out of JSON and JSON that was a soup of elements attributes and ___content___, thus giving you the complexity of XML in JSON. I don't think it got used once by our dev team, and I'm pretty sure it never processed a byte of anything of value.
andndnndd 10 hours ago [-]
[flagged]
dangscientist 14 hours ago [-]
[flagged]
tclancy 9 hours ago [-]
Mother of God. I can write JSON instead of a language designed for querying. What is the advantage? If I’m going to move up an abstraction layer, why not give me natural language? Lots of things turn a limited natural language grammar into SQL for you. What is JSON going to {do: for: {me}}?
Spivak 2 minutes ago [-]
I find it funny people are making fun of this while every ORM builds up an object representing the query and then compiles it to SQL. SQL but as a data structure you can manipulate has thousands of implementations because it solves a real problem. This time it's because LLMs have an easier time outputting complex JSON than SQL itself.
8n4vidtmkvmk 6 hours ago [-]
Sorry, I couldn't parse that. You didn't quote your keys
meindnoch 5 hours ago [-]
>you get the added benefit of writing queries in JSON instead of raw SQL
^ kids, this is what AI-induced brainrot looks like.
fkyimeanit 9 hours ago [-]
>you get the added benefit of writing queries in JSON instead of raw SQL
You should have written your comment in JSON instead of raw English.
jinjin2 3 hours ago [-]
I agree that using a semantic layer is the best way to get better precision. It is almost like a cheatsheet for the AI.
But I would never use one that forced me to express my queries in JSON. The best implementations integrate right into the database so they become an integral part of your regular SQL queries, and as such are also available to all your tools.
In my experience, from using the Exasol Semantic Layer, it can be a totally seamless experience.
galenmarchetti 12 hours ago [-]
still need someone to build the semantic layer, why not use text2sql or something similar for that
danjc 6 hours ago [-]
The article comments "out of the box, LLMs are particularly good at tasks like creative writing" but I think this actually demonstrates the problem with the ai.
A writer won't think that they're good at creative writing. In fact, I'm pretty sure they'd think LLM's are terrible at creative writing.
In other words, to an expert in their field, they're not that good - at least not yet.
But to someone who is not an expert, they're unbelievably good - they're enabled to do something they had zero ability to do before.
randomNumber7 4 hours ago [-]
Yes, but why is then everyone on HN claiming LLMs can code on expert level?
danjc 3 hours ago [-]
Fast for hammering out boilerplate, great for understanding something you've never done before. Much less value for field-frontier or novel work.
__loam 3 hours ago [-]
I would posit that most people on hackernews are actually not that experienced.
insin 10 hours ago [-]
Is it too late to rescue the phrase "one-shotted" or is it already too far gone, like "AI" and "agent"?
troupo 6 hours ago [-]
For some reason I can't get the image of someone swinging back shots of vodka/tequila every time I see "one-shotted" out of my head
th0ma5 5 hours ago [-]
Reminds me of the "crypto" name overloading. It is clear that fanboys are jealous of competence.
bob1029 12 hours ago [-]
For the problems where it would matter the most, these tools seem to help the least. The hardest problem domains don't have just one schema to worry about. They have hundreds. If you need to spin up a personal blog or todo list tracker, I have no doubt that Google, et al. can take you exactly where you want to go.
galenmarchetti 12 hours ago [-]
and then add in ambiguity in the business terms / intention behind the query. still a big need for something like semantic layer or ontology to sit between business and at least right now that stuff hasn’t been automated away yet (it should be though)
mrtimo 11 hours ago [-]
Malloy [1] has a semantic layer [2]... and Model Context Protocol (MCP) support is being added through Publisher [3]. Something to keep an eye on. Seems like a great fit for LLMs.
I wonder if, for a given dialect (and even DDL), you could use that token masking technique similar to how that Structured Outputs [1] thing went:
Quote:
"While sampling, after every token, our inference engine will determine which tokens are valid to be produced next based on the previously generated tokens and the rules within the grammar that indicate which tokens are valid next. We then use this list of tokens to mask the next sampling step, which effectively lowers the probability of invalid tokens to 0. Because we have preprocessed the schema, we can use a cached data structure to do this efficiently, with minimal latency overhead."
I.e. mask any tokens that would produce something that isn't valid SQL in the given dialect, or further, a valid query for the given schema. I assume some structured outputs capability is latent to most assistants nowadays, so they probably already have explored this
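A toy sketch of that masking idea, under heavy assumptions: the vocabulary has seven whole-word tokens, the "grammar" is a fixed `SELECT <column> FROM <table>` shape, the schema is made up, and `stub_model` stands in for a real model's logits. Real implementations work on subword tokens and full grammars; this only shows the masking step:

```python
import math

VOCAB = ["SELECT", "FROM", "city", "status", "users", "shoes", "<eos>"]
# Hypothetical schema the query must respect.
COLUMNS = {"city", "status"}
TABLES = {"users"}


def allowed_next(tokens: list[str]) -> set[str]:
    """Tokens the toy grammar `SELECT <column> FROM <table> <eos>` permits next."""
    position = len(tokens)
    if position == 0:
        return {"SELECT"}
    if position == 1:
        return COLUMNS
    if position == 2:
        return {"FROM"}
    if position == 3:
        return TABLES
    return {"<eos>"}


def mask_logits(logits: list[float], allowed: set[str]) -> list[float]:
    """Set disallowed tokens to -inf, i.e. probability 0 after softmax."""
    return [x if tok in allowed else -math.inf for tok, x in zip(VOCAB, logits)]


def constrained_greedy_decode(model_logits) -> list[str]:
    """Greedy decoding where every step is masked by the grammar."""
    tokens: list[str] = []
    while not tokens or tokens[-1] != "<eos>":
        masked = mask_logits(model_logits(tokens), allowed_next(tokens))
        best = max(range(len(VOCAB)), key=lambda i: masked[i])
        tokens.append(VOCAB[best])
    return tokens


def stub_model(tokens: list[str]) -> list[float]:
    """Stand-in for a real model: always prefers the invalid name 'shoes'."""
    return [0.1, 0.2, 0.5, 0.3, 0.4, 9.0, 0.0]
```

Even though the stub always scores `shoes` highest, `constrained_greedy_decode(stub_model)` can only ever produce names that exist in the schema. It says nothing about whether the query answers the question that was asked.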
Regarding the first issue: ” For example, even the best DBA in the world would not be able to write an accurate query to track shoe sales if they didn't know that cat_id2 = 'Footwear' in a pcat_extension table means that the product in question is a kind of shoe. The same is true for LLMs.”
I wish developers would make use of long table names and column names. For example, pcat_extension could have been named release_schema_1_0.product_category_extension. And cat_id2 could have been named category_id2.
rectang 12 hours ago [-]
> We will cover state-of-the-art [...] how we approach techniques that allows the system to offer virtually certified correct answers.
I don't need AI to generate perfect SQL, because I am never going to trust the output enough to copy/paste it — the risk of subtle semantic errors is too high, even if the code validates.
Instead, I find it helpful for AI to suggest approaches — after which I will manually craft the SQL, starting from scratch.
hsbauauvhabzb 11 hours ago [-]
Explain that to the average manager or junior engineer, both of whom don’t care about your desire to build well but not fast.
noosphr 11 hours ago [-]
> So now that we brought down prod for a day the new rule is no AI sql without three humans signing off on any queries.
Closi 10 hours ago [-]
If that’s the scenario, I would be asking why the testing pipeline didn’t catch this rather than why was the AI SQL wrong.
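As a rough illustration of the cheapest check such a pipeline could run, here is a sketch that prepares a generated statement against an empty copy of the schema using Python's built-in `sqlite3`. The schema and the function name `check_against_schema` are hypothetical. It catches made-up tables and columns only; it cannot catch wrong semantics, and it says nothing about how the statement behaves on a production-sized database:

```python
import sqlite3
from typing import Optional

# Hypothetical schema, standing in for a dump of the real one.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
"""


def check_against_schema(sql: str) -> Optional[str]:
    """Return None if the statement compiles against the schema, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        # EXPLAIN compiles the statement without running the query itself.
        conn.execute(f"EXPLAIN {sql}")
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()
```

A query against a column that does not exist comes back with an error string, while a valid one returns `None`. SQLite's dialect differs from other databases, so this is a smoke test and not a substitute for review.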
noosphr 10 hours ago [-]
Because the testing pipeline isn't the real database.
Anyone that knows a database well can bring it down with an innocent-looking statement that no one else will blink at.
fkyimeanit 9 hours ago [-]
Because the testing pipeline was generated by AI, and code-reviewed by AI, reading a PR description generated by AI.
rectang 9 hours ago [-]
It’s not true that I want to build “well but not fast” — I’m trying to add value, and both speed and reliability matter. My productivity is high and I don’t have trouble articulating why; my approach has generally (though not universally) been well received by management and colleagues.
hosel 11 hours ago [-]
Really? In my experience it’s been pretty good (using Pydantic)! I read over before I execute it, but it’s never done anything malicious.
yahoozoo 1 hours ago [-]
What is the relevance of Pydantic with SQL?
rectang 11 hours ago [-]
I don't trust myself to craft a prompt in natural language which completely specifies my intent as codified with the precision of a programming language.
I also tend to turn to AI for advising me on difficult use cases, and most of the time it's for production code rather than one-offs. The easy cases, I just write myself because it's more mental effort to review code for subtle errors than it is to write it.
paulddraper 11 hours ago [-]
Hopefully your trust in yourself is warranted
rectang 9 hours ago [-]
I embrace my fallibility, and enthusiastically pursue testing, code reviews, staging environments, and so on to minimize the mistakes that make it through to production.
It seems to me that this skeptical mindset is consonant with handling AI output with care.
auggierose 7 hours ago [-]
You'd rather trust in AI than yourself?
malthaus 6 hours ago [-]
in writing good sql code? i definitely would
ai is not going to replace the senior sql expert with 20 years of battle experience in the short-term but support me who last dug into sql 15 years ago and needs to get a working sql query in a project. and ai usually does a better job than me copy pasting googled code in between quickly browsing through tutorials.
tango12 13 hours ago [-]
What’s the eventual goal of text to sql?
Is it to build a copilot for a data analyst or to get business insight without going through an analyst?
If it’s the latter - then imho no amount of text to sql sophistication will solve the problem because it’s impossible for a non analyst to understand if the sql is correct or sufficient.
These don’t seem like text2sql problems:
> Why did we hit only 80% of our daily ecommerce transaction yesterday?
> Why is customer acquisition cost trending up?
> Why was the campaign in NYC worse than the same in SF?
phillipcarter 13 hours ago [-]
> These don’t seem like text2sql problems:
Correct, but I would propose two things to add to your analysis:
1. Natural language text is a universal input to LLM systems
2. text2sql makes the foundation of retrieving the information that can help answer these higher-level questions
And so in my mind, the goals for text2sql might be a copilot (near-term), but the long-term is to have a good foundation for automating text2sql calls, comparing results, and pulling them into a larger workflow precisely to help answer the kinds of questions you're proposing.
There's clearly much work needed to achieve that goal.
galenmarchetti 12 hours ago [-]
yeah I agree with this - good text2sql is essential but just one part of a larger stack that will actually get there. Seems possible tho
mynegation 13 hours ago [-]
To be fair, these don’t look like SQL problems either. SQL answers “what”, not “why” questions. The goal of text2sql is to free up analyst time to get through “what” much faster and - possibly- focus on “why” questions.
cdavid 13 hours ago [-]
My observation is the latter, but I agree the results fall short of expectations. Business will often want last-minute changes in reporting, not get what they want at the right time because of a lack of analysts, and hope that having "infinite speed" will solve the problem.
But ofc the real issue is that if your report metrics change last minute, you're unlikely to get a good report. That's a symptom of not thinking much about your metrics.
Also, reports / analysis generally take time because the underlying data are messy, lots of business knowledge encoded "out of band", and poor data infrastructure. The smarter analytics leaders will use the AI push to invest in the foundations.
richardw 13 hours ago [-]
Any algo that a human would follow can be built and tested. If you have 10 analysts you have 10 different skill levels, with differing understanding of the database and business context. So automation gives you a platform to achieve a floor of skill and knowledge. The humans can now be “at least this good or better”. A new analyst instantly gets better, faster.
I assume a useful goal would be to guide development of the system in coordination with experts, test it, have the AI explain all trade offs, potential bugs, sense check it against expected results etc.
Taste is hard to automate. Real insight is hard to automate. But a domain expert who isn’t an “analyst” can go extremely far with well designed automation and a sense of what rational results should look like. Obviously the state of the art isn’t perfect but you asked about goals, so those would be my goals.
layer8 12 hours ago [-]
But “text to sql” isn’t an algorithm.
richardw 12 hours ago [-]
The processes the people want the sql for are likely filled with algo’s. An exec wants info in a known domain, set up a text to sql system with lots of context and testing to generate queries. If they think they have something good, get an expert to test and productionise it.
“Thank you for your request. Can you walk me through the steps you’d use to do this manually? What things would you watch out for? What kind of number ranges are reasonable? I can propose an algorithm and you tell me if that’s correct. The admins have set up guidelines on how to reason about customer and purchase data. Is the following consistent with your expectations?”
layer8 12 hours ago [-]
This is the same fallacy as low-code/no-code. If you have to check a precise algorithm, you’re effectively coding, and you need a language with the same precision as a programming language.
richardw 7 hours ago [-]
Only if you want a production-ready output. To get execs able to self-feed enough, this works fine. Look, you don’t see value until it’s perfect. Good, other people do. I see your fallacy and raise you a false dichotomy.
zeroq 9 hours ago [-]
Every once in a while I've been trying AI, since everyone and their mother told me to, so I comply.
My recent endeavour was with Gemini 2.5:
- Write me a simple todo app on cloudflare with auth0 authentication.
- Here's a simple todo on cloudflare. We import the @auth0-cloudflare and...
- Does that @auth0-cloudflare exist?
- Oh, it doesn't. I can give you a walkthrough on how to set up an account on auth0. Would you like me to?
- Yes, please.
- Here. I'm going to write the walkthrough in a document... (proceed to create an empty document)
- That seems to be an empty document.
- Oh, my bad. I'll produce it once more. (proceed to create another empty document)
- Seems like your md parsing library is broken, can you write it in chat instead?
- Yes... (your gemini trial has expired, would you like to pay $100 to continue?)
karencarits 7 hours ago [-]
It's difficult to assess how typical your experience is; I tried your initial prompt (`Write me a simple todo app on cloudflare with auth0 authentication.` on gemini-2.5-pro-preview-05-06) and didn't get any mentions of @auth0-cloudflare, although I cannot verify whether the answer works as-is.
Shocked you got a different output from the stochastic token generator.
e3bc54b2 7 hours ago [-]
The worst part is not even being trolled at AI roundabout. The worst part is gaslighting by people who then go on to imply that I'm dumb to not be able to 'guide' the model 'towards the solution', whatever the fuck that means. And this is after telling me that the model is so smart that it just knows what I want.
Claude and Gemini are pretty decent at providing a small and tight function definition with well defined parameters and output, but anything big and it starts losing shit left and right.
All vibecoding sessions I've seen have been pretty dead easy stuff with lot of boilerplate, maybe I'm weird for just not writing a lot of boilerplate and rely on well-built expressive abstractions..
floren 7 hours ago [-]
Remember, if AI couldn't solve your problem, you were probably using the wrong model. Did you try with o5-selfsuck-20250523-512B?
kubb 6 hours ago [-]
It’s brilliant because you can always shift the blame on the user. Wrong prompt, wrong model, should have used an agent and ran 3 models in parallel, etc.
Meanwhile we get claims that the tools are as capable as a junior programmer, and CEOs believe that.
palmfacehn 4 hours ago [-]
My least favorite part of this trend is the ageism. "Crusty curmudgeons are not up-to date with the latest bloat if they think RTFM is still a thing", "Oh, you didn't like ORMs? Did you try letting an AI generate code for your ORM?"
Maybe in the future all of these assistants will offer something amazing, but in my experience, there is more time invested in prompting than just reading the relevant documentation and having a coherent design.
My suspicion is that many, (but not all please no flames) of the biggest boosters of AI coding are simply inexperienced. If this is true, it makes sense that they wouldn't recognize the numerous foot-guns in AI generated code.
sensanaty 4 hours ago [-]
Yeah it's my favourite argument. Apparently this magical tool that can replace engineers and can do and write anything needs you to write prompts so detailed that you could have just written the damn code yourself, and probably had an easier time with it to boot.
The whole thing feels like we're in a collective delusion because idiotic managers and C-suites are blindly lapping up the advertising slop coming from the AI companies.
Kiro 5 hours ago [-]
You're the one doing the gaslighting now. "It doesn't work for me, therefore it can't possibly work for anyone else."
raincole 36 minutes ago [-]
At this point you should just take this as your secret weapon. Let people convince each other that AI can't do that thing, while you are one-shotting the exact thing with a cost of $0.05.
codr7 41 minutes ago [-]
Which is a very reasonable conclusion given the kinds of errors it makes.
Why are you so defensive about the tech?
Involved in any AI startups, perhaps?
zxexz 11 hours ago [-]
I find Gemini excellent for sql. Wouldn’t consider myself an expert in many things, but in sql and database design id consider myself close. I like writing queries and doing the architecture, and that’s where it’s exceptionally helpful. The massive context length combined with pointed questions means i can just dump the entire DDL, and ask “what am i missing?”. It really is an excellent tool for helping with times like checks and catching dumb errors on complex databases.
todotask2 3 hours ago [-]
These days, we have many types of database tools: ORMs, query builders, and more. AI can help reduce the complexity and avoid lock-in to a specific tech stack. I love to write raw SQL.
iddan 5 hours ago [-]
For anybody wanting to use best-in-class AI SQL, I highly recommend checking out Sherloq (W23): https://www.sherloqdata.io/
msvana 6 hours ago [-]
Problem no. 2 (Understanding user intent) is relevant not only to writing SQL but also to software development in general. Follow-up questions are something I had in mind for a long time. I wonder why this is not the default for LLMs.
neuroelectron 9 hours ago [-]
No mention of knowing anything about the tables, versions or relational structure? Are we just assuming that's already given to the AI?
antman 12 hours ago [-]
This is a how-to on writing good SELECTs, not SQL. AI is good enough to also create schemas from spec, migrate, explore databases, do testing, etc., which this article does not touch upon.
TechDebtDevin 11 hours ago [-]
Every time I've fed it more than 5 migration files and asked Claude to make multiple changes across those files, it fails. It does very badly in almost all cases, even on kinda basic schemas. I actually don't think LLMs grok complex migration files or SQL that well at all.
fooker 10 hours ago [-]
Well that's a great startup idea if you're familiar with the domain.
stefap2 12 hours ago [-]
I have done this using the OpenAI 4o model. I had to pass in a prompt with business-specific instructions, industry jargon, and descriptions of tables, including foreign keys. Then it would generate even complex join queries and return data. In my case, I was more interested in providing results to users not knowledgeable about SQL, but the SQL was displayed for information.
JodieBenitez 5 hours ago [-]
Wait... people need AI to write SQL ?
randomNumber7 4 hours ago [-]
Most people here have not understood the relational model, so yes.
criddell 1 hours ago [-]
I’ve read most of Codd’s book on the subject and have written SQL on and off since the mid 90’s and I still need to look up the differences between the various joins anytime I use them.
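As a quick refresher on the join differences mentioned above, here is a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their rows are made up for illustration; the point is only that an inner join drops rows without a match, while a left join keeps them and fills in NULL.

```python
import sqlite3

# Hypothetical two-table schema, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 'shoes');
""")

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
).fetchall()

# LEFT JOIN: every customer, with NULL (None) where no order matches.
left_rows = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
).fetchall()

print(inner_rows)  # [('Ann', 'shoes')]
print(left_rows)   # [('Ann', 'shoes'), ('Bob', None)]
```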
treebeard901 9 hours ago [-]
If a lot of the value in a company is the software and over time a handful of AI companies start writing all the software, who really ends up owning all the value of the company?
wheelerwj 9 hours ago [-]
That’s easy. None of the value is in the software. The only value is in customers that use the software.
AdrianB1 13 hours ago [-]
In real life I find using AI for SQL dangerous. It allows people that don't know what they do to write queries that can significantly impact servers. In my world databases are relatively big for most developers, but not huge.
Sometimes when I want to fine tune a query I am challenging AI to provide a better solution. I give it the already optimized query and I ask for better. I never got a better answer, sometimes because AI is hallucinating or because the changes that it proposes are not working in a way that is beneficial, it is like an idiot parrot is telling what it overheard in the brothel - good info if it is a war brothel frequented by enemy officers in 1916, but not these days.
strict9 10 hours ago [-]
It should never be at the point where some random person can impact a server.
That's what read replicas with read-only access are for. Production db servers should not be open to random queries and usage by people. That's only for the app to use.
AdrianB1 3 hours ago [-]
How it should be and how it is depends on who the decision maker is. If the decision maker is a technical person, there is no gap, but in my case the decision maker is a non-technical manager with no competence to make such decisions, but that is the way the company is organized. So letting people use AI to dig through a 1 TB database is not a good idea, while not using AI prevents them from even trying. Security by oblivion.
cheema33 12 hours ago [-]
> I give it the already optimized query and I ask for better. I never got a better answer..
This was my experience as well. However, I have observed that things have been improving in this regard. Newer LLMs do perform much better. And I suspect they will continue to get better over time.
cjbgkagh 10 hours ago [-]
I’ve been working on highly optimized code that heavily uses CPU intrinsics, a year ago no chance, 6 months ago a helpful reference, today it’s a good starting point. That is an insane pace of improvement.
ziml77 11 hours ago [-]
The strategy I've used with these people is to let them prototype with AI and then have them hand over their work to me where I can then make it significantly more efficient. The nice thing is that their poor performing version acts as a reference for validating the output of my queries.
awesome_dude 13 hours ago [-]
Mate, IME programmers who don't know what they are doing just do it anyways then look to blame someone/something else if things turn to custard.
AI is just increasing the frequency of things turning to custard :)
HideousKojima 12 hours ago [-]
AI is most effective as an accountability sink
scarface_74 10 hours ago [-]
> It allows people that don't know what they do to write queries that can significantly impact servers.
At least for the only OLAP DB I use often - Amazon Redshift - that’s a solved problem with Workload Management Queues. You can restrict those users ability to consume too many resources.
For queries that are used for OLTP, I usually try to keep those queries relatively simple. If there is a reason for read queries that consume resources, those go to read replicas when strong consistency isn’t required.
mousetree 14 hours ago [-]
Out of all the AI tools and models I’ve tried, the most disappointing is the Gemini built into BigQuery. Despite having well named columns with good descriptions it consistently gets nowhere close to solving the problem.
flysand7 14 hours ago [-]
Having written more SQL than any other programming language by now, every time I've tried to use AI to write the query for me, I'd spend way more time getting the output right than if I'd just written it myself.
As a quick aside there's one thing I wish SQL had that would make writing queries so much faster. At work we're using a DSL that has one operator that automatically generates joins from foreign key columns, just like
credit.CLIENT->NAME
And you got clients table automatically joined into the query. Having to write ten to twenty joins for every query is by far the worst thing, everything else about writing SQL is not that bad.
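The DSL itself is not shown in the thread, so the following is only a guess at the idea: a tiny Python sketch that expands a `table.FK->COLUMN` path into an explicit join. The `FOREIGN_KEYS` catalog, the `clients.ID` key, and the `expand` helper are all hypothetical names introduced for illustration.

```python
import re

# Hypothetical foreign-key catalog:
# (table, fk_column) -> (referenced_table, referenced_column).
FOREIGN_KEYS = {
    ("credit", "CLIENT"): ("clients", "ID"),
}

# Matches paths of the form table.FK_COLUMN->TARGET_COLUMN.
PATH = re.compile(r"^(\w+)\.(\w+)->(\w+)$")


def expand(path: str) -> str:
    """Turn 'table.FK->COLUMN' into a SELECT with the join spelled out."""
    match = PATH.match(path)
    if match is None:
        raise ValueError(f"not a foreign-key path: {path!r}")
    table, fk_column, target_column = match.groups()
    ref_table, ref_column = FOREIGN_KEYS[(table, fk_column)]
    return (
        f"SELECT {ref_table}.{target_column} FROM {table} "
        f"JOIN {ref_table} ON {table}.{fk_column} = {ref_table}.{ref_column}"
    )


sql = expand("credit.CLIENT->NAME")
print(sql)
# SELECT clients.NAME FROM credit JOIN clients ON credit.CLIENT = clients.ID
```

A real implementation would read the catalog from the database's own foreign-key metadata and handle chained paths; this sketch covers a single hop only.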
emmelaich 10 hours ago [-]
I'd like there to be a function or macro for a bunch of joins, say
DEFINE products_by_order AS orders o JOIN order_items oi ON o.order_id = oi.order_id JOIN products p ON oi.product_id = p.product_id
You could make it visible to the DB rather than just a macro so it could optimise it by caching etc. Sort of like a view but on demand.
Although I think good enough language server / IDE could automatically insert the join when you typed `credit.CLIENT->NAME`
efromvt 13 hours ago [-]
(Shameless plug) writing the same joins over and over (and refactoring when you update stuff) was one of my biggest boilerplate annoyances with SQL - I’ve tried to fix that while still keeping the rest of SQL in https://trilogydata.dev/
galenmarchetti 12 hours ago [-]
yeah we’re doing something similar under the hood at AstroBee. it’s way way way easier to handle joins this way.
imo any hope of really leveraging llms in this context needs this + human review on additions to a shared ontology/semantic layer so most of the nuanced stuff is expressed simply and reviewed by engineering before business goes wild with it
quantadev 14 hours ago [-]
Having proper constraints and foreign keys that are clear is generally all that's needed in my experience. Are you sure your tables have well defined constraints, so that the AI can be absolutely 100% sure how everything links up? SQL is very precise, but only if you're utilizing constraints and foreign key definitions well.
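To make the point about constraints concrete, here is a rough sketch of collecting table DDL and declared foreign keys so they can be pasted into a prompt. It uses SQLite via Python's `sqlite3` as a stand-in (the thread is about BigQuery, which exposes this information differently), and the `clients` and `credit` tables are invented for the example.

```python
import sqlite3

# Hypothetical schema with explicit constraints and a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE credit (
        id INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL REFERENCES clients(id),
        amount REAL CHECK (amount >= 0)
    );
""")

# sqlite_master keeps the CREATE statement text for each table.
ddl = "\n\n".join(
    row[0]
    for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
)

# PRAGMA foreign_key_list reports the declared links explicitly.
# Columns: id, seq, table, from, to, on_update, on_delete, match.
links = [
    (row[3], row[2], row[4])  # (from column, referenced table, referenced column)
    for row in conn.execute("PRAGMA foreign_key_list(credit)")
]

print(ddl)
print(links)  # [('client_id', 'clients', 'id')]
```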
carderne 13 hours ago [-]
It’s BigQuery, so it likely won’t have any of these.
quantadev 11 hours ago [-]
BigQuery supports all those SQL things I mentioned.
carderne 11 hours ago [-]
I’m just saying it’s likely they aren’t using them. But clearly you should if you want LLMs to do anything useful.
nashashmi 14 hours ago [-]
AI text to regex solutions would be incredibly handy.
RadiozRadioz 13 hours ago [-]
This comment appears frequently and always surprises me. Do people just... not know regex? It seems so foreign to me.
It's not like it's some obscure thing, it's absolutely ubiquitous.
Relatively speaking it's not very complicated, it's widely documented, has vast learning resources, and has some of the best ROI of any DSL. It's funny to joke that it looks like line noise, but really, there is not a lot to learn to understand 90% of the expressions people actually write.
It takes far longer to tell an AI what you want than to write a regex yourself.
nevf1 3 hours ago [-]
I respectfully disagree. Thankfully, I don't need to write regex much, so when I do it's always like it's the first time. I don't find the syntax particularly intuitive and I always rely on web-based or third party tools to validate my regex.
Whenever I have worked on code smells (performance issues, fuzzy test fails etc), regex was 3rd only to poorly written SQL queries, and/or network latency.
All-in-all, not a good experience for me. Regex is the one task that I almost entirely rely on GitHub Copilot in the 3-4 times a year I have to.
simonw 6 hours ago [-]
"It takes far longer to tell an AI what you want than to write a regex yourself."
My experience is the exact opposite. Writing anything but the simplest regex by hand still takes me significant time, and I've been using them for decades.
Getting an LLM to spit out a regex is so much less work. Especially since an LLM already knows the details of the different potential dialects of regex.
I use them to write regexes in PostgreSQL, Python, JavaScript, ripgrep... they've turned writing a regex from something I expect to involve a bunch of documentation diving to something I'll do on a whim.
Here's a recent example - my prompt included a copy of a PostgreSQL schema and these instructions:
Write me a SQL query to extract
all of my images and their alt
tags using regular expressions.
In HTML documents it should look
for either <img .* src="..." .*
alt="..." or <img alt="..." .*
src="..." (images may be self-
closing XHTML style in some
places). In Markdown they will
always be ![alt text](src)
The markdown portion of that is a good example of the kind of regex I don't enjoy writing by hand, due to the need to remember exactly which characters to escape and how:
(REGEXP_MATCHES(commentary,
'!\[([^\]]*)\]\(([^)]*)\)', 'g'))[2] AS src,
(REGEXP_MATCHES(commentary,
'!\[([^\]]*)\]\(([^)]*)\)', 'g'))[1] AS alt_text
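For readers who want to poke at the Markdown pattern outside PostgreSQL, here is a small sketch that runs the same expression through Python's `re` module. Regex dialects differ, so treat this as an approximation of what `REGEXP_MATCHES` does rather than an exact equivalent; the sample `commentary` string is made up.

```python
import re

# Same pattern as in the query above: group 1 is the alt text, group 2 the src.
IMAGE = re.compile(r'!\[([^\]]*)\]\(([^)]*)\)')

# Made-up sample text with one normal image and one with empty alt text.
commentary = "Intro ![A cat](https://example.com/cat.png) and ![](dog.jpg) end."

matches = IMAGE.findall(commentary)
print(matches)
# [('A cat', 'https://example.com/cat.png'), ('', 'dog.jpg')]
```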
Perhaps Perl has given me Stockholm Syndrome, but when I look at your escaped regex example, it's extremely natural for me. In fact, I'd say it's a little too simple, because the LLM forgot to exclude unnecessary whitespace:
(REGEXP_MATCHES(commentary,
'!\[\s*([^\]]*?)\s*\]\(\s*([^)]*?)\s*\)', 'g'))[2] AS src,
(REGEXP_MATCHES(commentary,
'!\[\s*([^\]]*?)\s*\]\(\s*([^)]*?)\s*\)', 'g'))[1] AS alt_text
That is just nitpicking a one-off example though, I understand your wider point.
I appreciate the LLM is useful for problems outside one's usual scope of comfort. I'm mainly saying that I think it's a skill where the "time economics" really are in favor of learning it and expanding your scope. As in, it does not take a lot of learning time before you're faster than the LLM for 90% of things, and those things occur frequently enough that your "learning time deficit" gets repaid quickly. Certainly not the case for all skills, but I truly believe regex is one of them due to its small scope and ubiquitous application. The LLM can be used for the remaining 10% of really complicated cases.
As you've been using regex for decades, there is already a large subset of problems where you're faster than the LLM. So that problem space exists, it's all about how to tune learning time to right-size it for the frequency the problems are encountered. Regex, I think, is simple enough & frequent enough where that works very well.
simonw 28 minutes ago [-]
> As in, it does not take a lot of learning time before you're faster than the LLM for 90% of things, and those things occur frequently enough that your "learning time deficit" gets repaid quickly.
It doesn't matter how fast I get at regex, I still won't be able to type any but the shortest (<5 characters) patterns out quicker than an LLM can. They are typing assistants that can make really good guesses about my vaguely worded intent.
As for learning deficit: I am learning so much more thanks to heavy use of LLMs!
Prior to LLMs the idea of using a 100 line PostgreSQL query with embedded regex to answer a mild curiosity about my use of alt text would have finished at the idea stage: that's not a high value enough problem for me to invest more than a couple of minutes, so I would not have done it at all.
eddd-ddde 12 hours ago [-]
I know regex. But I use it so sparingly that every time I need it I forgot again the character for word boundary, or the character for whitespace, or the exact incantation for negative lookahead. Is it >!? who knows.
A shortcut to type in natural language and get something I can validate in seconds is really useful.
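For reference, the three constructs mentioned above look like this in Python's `re` syntax. Other regex flavors mostly agree on these, but not universally, so this is a sketch for one dialect; the sample strings are arbitrary.

```python
import re

# \b is a word boundary: "cat" matches only as a whole word.
whole_word = re.findall(r"\bcat\b", "cat concat cat.")

# \s is any whitespace character; \s+ matches a run of it.
collapsed = re.sub(r"\s+", " ", "a \t b\n c")

# (?!...) is a negative lookahead: "foo" not immediately followed by "bar".
not_foobar = re.findall(r"foo(?!bar)\w*", "foobar foobaz foo")

print(whole_word)  # ['cat', 'cat']
print(collapsed)   # a b c
print(not_foobar)  # ['foobaz', 'foo']
```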
layer8 12 hours ago [-]
How do you validate it if you don’t know the syntax? Or are you saying that looking up syntax –> semantics is significantly quicker than semantics –> syntax? Which I don’t find to be the case. What takes time is grokking the semantics in context, which you have to do in both cases.
That doesn’t answer the question. By “validate”, I mean “prove to yourself that the regular expression is correct”. Much like with program code, you can’t do that by only testing it. You need to understand what the expression actually says.
widdershins 3 hours ago [-]
Testing something is the best way to prove that it behaves correctly in all the cases you can think of. Relying on your own (fallible) understanding is dangerous.
Of course, there may be cases you didn't think of where it behaves incorrectly. But if that's true, you're just as likely to forget those cases when studying the expression to see "what it actually says". If you have tests, fixing a broken case (once you discover it) is easy to do without breaking the existing cases you care about.
So for me, getting an AI to write a regex, and writing some tests for it (possibly with AI help) is a reasonable way to work.
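A minimal sketch of that workflow, assuming a hypothetical date-matching regex as the thing under test: keep a table of inputs with expected outcomes, and add a row whenever a new broken case is discovered. The `DATE` pattern, the `CASES` table, and the `failures` helper are all illustrative names, not something from the thread.

```python
import re

# Hypothetical regex under test: an ISO-style date such as 2025-05-23.
DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

# Each case pairs an input with whether the whole string should match.
CASES = [
    ("2025-05-23", True),
    ("2025-13-01", False),   # month out of range
    ("2025-00-10", False),   # month zero
    ("2025-5-23", False),    # missing zero padding
    ("x2025-05-23", False),  # surrounding text must be rejected
]


def failures():
    """Return the cases where the regex disagrees with the expectation."""
    return [
        (text, expected)
        for text, expected in CASES
        if (DATE.fullmatch(text) is not None) != expected
    ]


print(failures())  # [] when every case behaves as expected
```

The table does not prove the regex correct, which is the objection raised earlier in the thread; it only pins down the cases someone has thought of so later edits cannot silently break them.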
marcosdumay 12 hours ago [-]
Notice that site has a very usable reference list you can consult for all those details the GP forgets.
CuriouslyC 12 hours ago [-]
I was using perl in the late 90s for sysadmin stuff, have written web scrapers in python and have a solid history with regex. That being said, AI can still write really complex lookback/lookahead/nested extraction code MUCH faster and with fewer bugs than me, because regex is easy to make small mistakes with even when proficient.
emmelaich 10 hours ago [-]
There's often a bunch of edge cases that people overlook. And you also get quadratic behaviour for some fairly 'simple' looking regexes that few people seem aware of.
insin 13 hours ago [-]
IME it's not just longer, but also more difficult to tell the LLM precisely what you want than to write it yourself if you need a somewhat novel RegExp, which won't be all over the training data.
I needed one to do something with Markdown which was a very internal BigCo thing to need to do, something I'd never have written without weird requirements in play. It wasn't that tricky, but going back trying to get LLMs to replicate it after the fact from the same description I was working from, they were hopeless. I need to dig that out again and try it on the latest models.
crystal_revenge 10 hours ago [-]
I personally didn’t really understand how to write regex until I understood “regular languages” properly, then it was obvious.
I’ve found that the vast majority of programmers today do not have any foundation in formal languages and/or the theory of computation (something that 10 years ago was pretty common to assume).
It used to be pretty safe to assume that everyone from perl hackers to computer science theorists understood regex pretty well, but I’ve found it’s increasingly a rare skill. While it used to be common for all programmers to understand these things, even people with a CS background view that as some annoying course they forgot as soon as the exam was over.
nashashmi 11 hours ago [-]
I use regex as an alternative to wildcards in various apps like notepad++ and vscode. The format is different in each app. And the syntax is somewhat different. I have to research it each time. And complex regex is a nightmare.
Which is why I would ask an AI to build it if it could.
jacob019 13 hours ago [-]
The first language I used to solve real problems was perl, where regex is a first class citizen. In python less so, most of my python scripts don't use it. I love regex but know several developers who avoid it like the plague. You don't know what you don't know, and there's nothing wrong with that. LLM's are super helpful for getting up to speed on stuff.
fooker 10 hours ago [-]
Regex, especially non standard (and non regular) extensions can be pretty tricky to grok.
took me 25.75 seconds, including learning how the website worked. I actually solved it in ~15 seconds, but I hadn't realized I got the correct answer because it was far too simple.
since you know so much regex, why dont you write a regex html parser /s
fkyimeanit 9 hours ago [-]
"Text to SQL", "text to regex", "text to shell", etc. will never fundamentally work because the reason we have computer languages is to express specific requirements with no ambiguity.
With an AI prompt you'll have to do the same thing, just more verbosely.
You will have to do what every programmer hates, write a full formal specification in English.
DonHopkins 4 hours ago [-]
Oh, so an AI assisted number of problems increaser?
there’s two kinds of people using AI to generate SQL…those who say it’s already solved and those who say it’ll be impossible to ever solve
quantadev 14 hours ago [-]
I agree. There's really no magic to it any more. The table create DDL commands are a very precise description of the tables, so almost nothing more is ever needed. You can just describe in detail what query you need, and any decent LLM can do it just fine.
> Even with a high-quality model, there is still some level of non-determinism or unpredictability involved in LLM-driven SQL generation. To address this we have found that non-AI approaches like query parsing or doing a dry run of the generated SQL complements model-based workflows well. We can get a clear, deterministic signal if the LLM has missed something crucial, which we then pass back to the model for a second pass. When provided an example of a mistake and some guidance, models can typically address what they got wrong.
Sounds like a bunch of bespoke not-AI work is being done to make up for LLM limitations that point blank can’t be resolved.
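The quoted passage does not say how the dry run is implemented, so here is one possible sketch of the idea using SQLite through Python's `sqlite3`: compile the generated SQL against an empty copy of the schema with `EXPLAIN`, and return the error text as a deterministic signal that could be fed back to the model. The schema, the `dry_run` helper, and the sample queries are assumptions for illustration; the call to the model itself is left out.

```python
import sqlite3
from typing import Optional

# Hypothetical schema; no data is needed for a dry run.
SCHEMA = """
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product_id INTEGER);
"""


def dry_run(sql: str) -> Optional[str]:
    """Return None if the query compiles against the schema, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        # EXPLAIN compiles the statement without executing it.
        conn.execute(f"EXPLAIN {sql}")
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


good = dry_run(
    "SELECT COUNT(*) FROM orders o "
    "JOIN products p ON p.product_id = o.product_id "
    "WHERE p.category = 'shoes'"
)
bad = dry_run("SELECT COUNT(*) FROM orders WHERE catgory = 'shoes'")

print(good)  # None
print(bad)   # error text naming the unknown column
```

A check like this only catches queries that fail to compile. A query that runs but answers the wrong question passes straight through, which is the limitation other commenters point out.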
deadbabe 11 hours ago [-]
All this LLM written SQL stuff sounds great until you realize if you don’t really know SQL you won’t be able to debug or fix any broken SQL an LLM generates.
Thus, this is mainly just a tool for true experts to do less work and still get paid the same, not a tool for beginners to rise to the level of experts.
roywiggins 11 hours ago [-]
It depends, sometimes just feeding back broken SQL with "that didn't return any rows, can you fix it" and it comes up with something that works. Or "you're looking at the wrong entity, look at this table instead" or whatever, without knowing how to write competent SQL.
Obviously being able to at least read a bit of SQL and understanding the basic idea of relational databases helps loads.
harvey9 5 hours ago [-]
I'm not an expert but I've written SQL on and off for years. LLMs help me when I can describe my intent but can't think how to implement it. I don't expect a perfect solution just a starting point that I can refine.
bongodongobob 11 hours ago [-]
Have you not actually used LLMs? Just copy in the errors and away it goes.
deadbabe 10 hours ago [-]
Error goes away but it gives the wrong result.
LAC-Tech 11 hours ago [-]
If LLMs are so wonderful we can just read from B+ Tree storage engines directly. SQL, ORMs, Query Planners... all bloat.
gerdesj 11 hours ago [-]
"Given the prompt "I have a database schema that contains products and orders. Write a SQL query that shows the number of orders for shoes""
How on earth is this an AI job?
In the example you describe there are several technical things in nearly natural language and you mention two things that would be a drop down in a GUI. For starters this assumes you know what SQL is and your data layout or "schema".
Regardless of using AI, you need to understand the base technology.
SQL is not intractable for queries, once you have worked out the relationships. The relationship complexity will be the same for an AI prompt too no matter how cool you feel.
Your AI might find a customer 1:M shoes relationship or not. I suggest that anything beyond a couple of tables model will go horribly wrong.
josephg 11 hours ago [-]
If you know SQL, then yeah! But if you don't know SQL, using an AI to write a few queries & debug them is a great way to learn it.
I'm pretty comfortable with sql but still found it a fabulous tool recently. I have a sql database which describes a tree of some ~600k events. Each event is in a session (via session_id). Most events have a parent event - and trees of events can involve multiple sessions.
I wanted to add two derived columns to my events table. For each event, I wanted to name the root event for that event's tree and the root event within this session. I had code in typescript to do it - but unsurprisingly it was pretty slow. Well, it turns out you can write a recursive SQL query which can traverse the graph and populate those columns. I had no idea that was even possible.
ChatGPT managed it pretty well - though I ended up making a bunch of tweaks to the query it suggested to simplify it. I learned a bunch of SQL in the process - and that was cool! Obviously I could have read the SQL documentation and figured it out myself, but it was faster & easier using chatgpt. Writing SQL queries is a fantastic use case for LLMs.
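The actual query is not shown, so the following is only a sketch of what such a recursive query could look like, run on SQLite through Python's `sqlite3`. The `events` columns, the sample rows, and the reading of "root event within this session" as the topmost ancestor reachable without leaving the session are all assumptions.

```python
import sqlite3

# Hypothetical events table: each event has an optional parent and a session.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, parent_id INTEGER, session_id TEXT);
    INSERT INTO events VALUES
        (1, NULL, 'a'),
        (2, 1,    'a'),
        (3, 2,    'b'),
        (4, 3,    'b'),
        (5, NULL, 'c');
""")

# Start from events with no parent, then walk down to children,
# carrying the tree root along and resetting the session root
# whenever a child belongs to a different session than its parent.
rows = conn.execute("""
    WITH RECURSIVE walk(id, tree_root, session_root, session_id) AS (
        SELECT id, id, id, session_id FROM events WHERE parent_id IS NULL
        UNION ALL
        SELECT e.id,
               w.tree_root,
               CASE WHEN e.session_id = w.session_id
                    THEN w.session_root ELSE e.id END,
               e.session_id
        FROM events e JOIN walk w ON e.parent_id = w.id
    )
    SELECT id, tree_root, session_root FROM walk ORDER BY id
""").fetchall()

print(rows)
# [(1, 1, 1), (2, 1, 1), (3, 1, 3), (4, 1, 3), (5, 5, 5)]
```

The result could then be written back into the two derived columns with an UPDATE that joins on `id`.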
The Omega Directive: https://snth.prose.sh/the_omega_directive
I tend to form the story arc in my head, and outline the major events in a timeline, and create very short summaries of important scenes, then use AI to turn those summaries into rough narrative outlines by asking me questions and then using my answers to fill in the details.
Next I'll feed that abbreviated manuscript into AI and brainstorm as to what's missing/where the flow could use improvement/etc with no consideration for prose quality, and start filling in gaps with new scenes until I feel like I have a compelling rough outline.
Then I just plow from beginning to end rewriting each chapter, first with AI to do a "beta" draft, then I rewrite significant chunks by hand to make things really sharp.
After this is done I'll feed the manuscript back into AI and get it to beta read given my target audience profile and ambitions for the book, and ask it to provide me feedback on how I can improve the book. Then I start editing based on this, occasionally adding/deleting scenes or overhauling ones that don't quite work based on a combination of my and AI's estimation. When Gemini starts telling me it can't think of much to improve the manuscript that's when it's time for human beta readers.
I've been wondering about what the legalities of the generated content are though, since we know that a lot of the artistic source content was used without consent? Can I put the stories on my blog? Or, not that I wanted to, publish them? I guess people use AI generated code everywhere so I guess for practical purposes the cat is out of the bag and won't be put back in again.
Apart from the apologising. It's silly when the AI apologises with ever more sincere apologies. There should be no apologies from AIs.
We used to serve others, but now people are so excited about serving themselves first that there's almost no talk of service to others at all anymore
We're literally creating solutions for our own problems
It depends, because you now have to pay in order to be able to compete against other programmers who're also using AI tools, it wasn't like that in what I'd call the true "golden age", basically the '90s - early part of the 2000s, when the internet was already a thing and one could put together something very cool with just a "basic" text editor.
Pay-as-you-go with Gemini does not snort your data for their own purposes (allegedly...).
I couldn’t find a way to use Gemini like a prepaid plan. I ain’t giving my credit card to Google for an LLM that can easily charge me hundreds or thousands of EUR.
Is there any concrete example that makes it really obvious? I had no such success with it so far and I would really like to see the clear-cut difference between Gemini and the others.
https://developers.google.com/gemini-code-assist/docs/overvi...
What about Zed or something else?
I have not used any IDEs like Cursor or Zed, so I am not sure what I should be using (on Linux). I typically just get on Claude (claude.ai) or ChatGPT and do everything manually. It has worked fine for me so far, but if there is a way to reduce friction, I am willing to give it a try. I do not really need anything advanced, however. I just want to feed it the whole codebase (at times), some documentation, and then provide prompts. I mostly care about support for Claude and perhaps Gemini (would like to try it out).
Anyone else concerned about these kinds of statements? Make no mistake, everyone. We are living in an LLM bubble (not an AI bubble, as none of these companies are actually interested in AI as such, as in moving towards AGI). They are all trying to commercialise LLMs with some minor tweaks. I don't expect LLMs to make the kind of progress made by the first 3 iterations of GPT. And when the insanely hyped overvaluations crash, the bubble WILL burst. You BETTER hope there is any money left to run these kinds of tools at a profit or you will be back at Stackoverflow trying to relearn all the skills you lost using generative coding tools.
// Moved to foo.ts
Ok, great. That’s what git is for.
// Loop over the users array
Ya. I can read code at a CS101 level, thanks.
But seriously, yeah, Gemini is pretty great.
That doesn't mean it's worse than the others, just not much better. I haven't found anything that worked better than o1-preview so far. How are you using it?
It's pretty useful as long as you hold it back from writing code too early, or too generally, or sometimes at all. It's a chronic over-writer of code, too. Ignoring most of what it attempts to write and using it to explore the design space without ever getting bogged down in code and other implementation details is great though.
I've been doing something that's new to me but is going to be all over the training data (subscription service using stripe) and have often been able to pivot the planned design of different aspects before writing a single line of code because I can get all the data it already has regurgitated in the context of my particular tech stack and use case.
But more seriously, they need to uncap temperature and allow more samplers if they want to really flex on their competition.
Essentially we were hoping to tie that to data inputs and have a system to regularly output the visualisation but with dynamic values. I bet my colleague it would one shot it: it did.
What I’ve also found is that even a sloppy prompt still somehow is reading my mind on what to do, even though I’ve expressed myself poorly.
Inversely, I’ve really found myself rejecting suggestions from ChatGPT, even o4-mini-high. It’s just doing so much random crap I didn’t ask and the code is… let’s say not as “Gemini” as I’d prefer.
I feel like that's actually true now with LLMs -- if some query I write doesn't get one-shotted, I don't bother with a galaxy-brain prompt; I just shelve it 'til next month and the next big OpenAI/Anthropic/Google model will usually crush it.
But I don't see how this is good news at all from a societal POV.
The last 15 or so years have seen an unprecedented rise in salaries for engineers, especially software engineers. This has brought an interest in the profession from people who would normally not have considered SW as a profession. I think this is both good and bad. It has brought newfound wealth to more people, but it may have also diluted the quality of the talent pool. That said, I think it was mostly good.
Now with this game-changing efficiency from these AI tools, I'm sure we've seen an end to the glory days in terms of salaries for the SW profession.
With this gone, where else could relatively normal people achieve financial independence? Definitely not in the service industry.
Very sad.
> But I don't see how this is good news at all from a societal POV.
Think about all the lamplighters who lost their jobs. Streetlights just turn on now? Lamplighting used to be considered a stable job! And what about the ice cutters…
For real tho, it's not like there's nothing left to do — we still have potholes to fix, t-shirts to fold and illnesses to cure. Just the fact that many people continue to believe that wars are justified by resource scarcity shows we need technological progress.
These days not so much.
Learning comes through struggle and it's too easy to bypass that struggle now. It's so much easier to get the answers from AI.
The amount and complexity of software will expand to its very outer bounds for which specialists will be required.
With all this money sloshing around, it takes only a little imagination to think of ways of channeling some of it to working people without employing them to write pointless (or in some cases actively harmful) software.
That's all it takes to get reliably excellent results. It's not perfect, but, at this point, 90% hallucinations on normal SDK usage strongly suggests poor usage of what is the current state of the art.
If you had something on the other side to hallucinate the API itself you could have a program that dreams itself into existence as you use it.
Then it apologizes and gives the right answer. It's weird. We really need a new word for what they're doing, 'cos it ain't thinking.
It's the cleanest way to give the right context and the best place to pull a human in the loop.
A human can validate and create all important metrics (e.g. what does "monthly active users" really mean) then an LLM can use that metric definition whenever asked for MAU.
With a semantic layer, you get the added benefit of writing queries in JSON instead of raw SQL. LLMs are much more consistent at writing a small JSON vs. hundreds of lines of SQL.
We[0] use cube[1] for this. It's the best open source semantic layer, but there's a couple closed source options too.
My last company wrote a post on this in 2021[2]. Looks like the acquirer stopped paying for the blog hosting, but the HN post is still up.
0 - https://www.definite.app/
1 - https://cube.dev/
2 - https://news.ycombinator.com/item?id=25930190
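To make the semantic-layer idea above concrete, here is a minimal sketch, not Cube's actual schema or API. It assumes a hypothetical `METRICS` registry that a human has reviewed, and a hypothetical `compile_query` helper that turns a small JSON request (the part an LLM would write) into SQL. The table, column, and dimension names are made up for illustration.

```python
import json

# Hypothetical, human-reviewed metric definitions (NOT Cube's real format).
METRICS = {
    "monthly_active_users": {
        "table": "events",
        "expression": "COUNT(DISTINCT user_id)",
    },
}

# Hypothetical whitelist of columns the LLM is allowed to group by.
ALLOWED_DIMENSIONS = {"country", "plan"}


def compile_query(raw_json: str) -> str:
    """Turn a small JSON request into SQL, rejecting anything not pre-approved."""
    request = json.loads(raw_json)
    metric_name = request["metric"]
    metric = METRICS[metric_name]  # KeyError if the LLM invents a metric

    dimensions = request.get("dimensions", [])
    unknown = set(dimensions) - ALLOWED_DIMENSIONS
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")

    select_cols = dimensions + [f"{metric['expression']} AS {metric_name}"]
    sql = f"SELECT {', '.join(select_cols)} FROM {metric['table']}"
    if dimensions:
        sql += f" GROUP BY {', '.join(dimensions)}"
    return sql


print(compile_query('{"metric": "monthly_active_users", "dimensions": ["country"]}'))
```

The LLM only has to produce the short JSON document; the definition of what "monthly active users" means stays in one reviewed place.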
I’m sorry, I can’t. The tail is wagging the dog.
dang, can you delete my account and scrub my history? I’m serious.
JSON:
XON: It gives you all the flexibility of JSON with the mature tooling of XML!
Edit: jesus christ, it actually exists https://sevenval.gitbook.io/flat/reference/templating/oxn
^ kids, this is what AI-induced brainrot looks like.
You should have written your comment in JSON instead of raw English.
But I would never use one that forced me to express my queries in JSON. The best implementations integrate right into the database so they become an integral part of your regular SQL queries, and as such also available to all your tools.
In my experience, from using the Exasol Semantic Layer, it can be a totally seamless experience.
A writer won't think that they're good at creative writing. In fact, I'm pretty sure they'd think LLMs are terrible at creative writing.
In other words, to an expert in their field, they're not that good - at least not yet.
But to someone who is not an expert, they're unbelievably good - they're enabled to do something they had zero ability to do before.
[1] https://www.malloydata.dev/ [2] https://docs.malloydata.dev/documentation/user_guides/malloy... [3] https://github.com/malloydata/publisher
Quote: "While sampling, after every token, our inference engine will determine which tokens are valid to be produced next based on the previously generated tokens and the rules within the grammar that indicate which tokens are valid next. We then use this list of tokens to mask the next sampling step, which effectively lowers the probability of invalid tokens to 0. Because we have preprocessed the schema, we can use a cached data structure to do this efficiently, with minimal latency overhead."
I.e. mask any tokens that would produce something that isn't valid SQL in the given dialect, or further, a valid query for the given schema. I assume some structured outputs capability is latent to most assistants nowadays, so they probably already have explored this
[1] https://openai.com/index/introducing-structured-outputs-in-t...
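The masking step described in the quote can be illustrated with a toy example. This is only a sketch under strong assumptions: the "grammar" is a hypothetical fixed table for `SELECT <column> FROM <table>`, the schema names are invented, and real inference engines work on subword tokens with a compiled grammar rather than whole words.

```python
import math

# Toy grammar for "SELECT <column> FROM <table>": index = number of tokens
# generated so far, value = tokens allowed next. Schema names are hypothetical.
ALLOWED_NEXT = [
    {"SELECT"},
    {"id", "name"},
    {"FROM"},
    {"users", "orders"},
]


def mask_logits(logits: dict[str, float], generated: list[str]) -> dict[str, float]:
    """Set the score of every token the grammar forbids to -inf."""
    allowed = ALLOWED_NEXT[len(generated)]
    return {
        token: (score if token in allowed else -math.inf)
        for token, score in logits.items()
    }


def greedy_decode(step_logits: list[dict[str, float]]) -> list[str]:
    """Pick the best-scoring grammar-valid token at each step."""
    generated: list[str] = []
    for logits in step_logits:
        masked = mask_logits(logits, generated)
        generated.append(max(masked, key=masked.get))
    return generated


# The "model" scores DROP highest, but the mask never lets it through.
scores = {"SELECT": 1.0, "DROP": 5.0, "id": 0.5, "name": 0.7,
          "FROM": 0.2, "users": 0.1, "orders": 0.3}
print(greedy_decode([scores] * 4))
```

A score of `-inf` becomes probability 0 after a softmax, which is the "lowers the probability of invalid tokens to 0" part of the quote.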
I wish developers would make use of long table names and column names. For example, pcat_extension could have been named release_schema_1_0.product_category_extension. And cat_id2 could have been named category_id2.
I don't need AI to generate perfect SQL, because I am never going to trust the output enough to copy/paste it — the risk of subtle semantic errors is too high, even if the code validates.
Instead, I find it helpful for AI to suggest approaches — after which I will manually craft the SQL, starting from scratch.
Anyone that knows a database well can bring it down with an innocent-looking statement that no one else will blink at.
I also tend to turn to AI for advising me on difficult use cases, and most of the time it's for production code rather than one-offs. The easy cases, I just write myself because it's more mental effort to review code for subtle errors than it is to write it.
It seems to me that this skeptical mindset is consonant with handling AI output with care.
AI is not going to replace the senior SQL expert with 20 years of battle experience in the short term, but it does support someone like me, who last dug into SQL 15 years ago and needs to get a working SQL query into a project. And AI usually does a better job than me copy-pasting googled code in between quickly browsing through tutorials.
Is it to build a copilot for a data analyst or to get business insight without going through an analyst?
If it’s the latter - then imho no amount of text to sql sophistication will solve the problem because it’s impossible for a non analyst to understand if the sql is correct or sufficient.
These don’t seem like text2sql problems:
> Why did we hit only 80% of our daily ecommerce transaction yesterday?
> Why is customer acquisition cost trending up?
> Why was the campaign in NYC worse than the same in SF?
Correct, but I would propose two things to add to your analysis:
1. Natural language text is a universal input to LLM systems
2. text2sql makes the foundation of retrieving the information that can help answer these higher-level questions
And so in my mind, the goals for text2sql might be a copilot (near-term), but the long-term is to have a good foundation for automating text2sql calls, comparing results, and pulling them into a larger workflow precisely to help answer the kinds of questions you're proposing.
There's clearly much work needed to achieve that goal.
But ofc the real issue is that if your report metrics change last minute, you're unlikely to get a good report. That's a symptom of not thinking much about your metrics.
Also, reports / analysis generally take time because the underlying data are messy, lots of business knowledge encoded "out of band", and poor data infrastructure. The smarter analytics leaders will use the AI push to invest in the foundations.
I assume a useful goal would be to guide development of the system in coordination with experts, test it, have the AI explain all trade offs, potential bugs, sense check it against expected results etc.
Taste is hard to automate. Real insight is hard to automate. But a domain expert who isn’t an “analyst” can go extremely far with well designed automation and a sense of what rational results should look like. Obviously the state of the art isn’t perfect but you asked about goals, so those would be my goals.
“Thank you for your request. Can you walk me through the steps you’d use to do this manually? What things would you watch out for? What kind of number ranges are reasonable? I can propose an algorithm and you tell me if that’s correct. The admins have set up guidelines on how to reason about customer and purchase data. Is the following consistent with your expectations?”
My recent endeavour was with Gemini 2.5:
https://pastebin.com/yfg0Zn0u
Claude and Gemini are pretty decent at providing a small and tight function definition with well defined parameters and output, but anything big and it starts losing shit left and right.
All vibecoding sessions I've seen have been pretty dead easy stuff with a lot of boilerplate, maybe I'm weird for just not writing a lot of boilerplate and relying on well-built expressive abstractions..
Meanwhile we get claims that the tools are as capable as a junior programmer, and CEOs believe that.
Maybe in the future all of these assistants will offer something amazing, but in my experience, there is more time invested in prompting than just reading the relevant documentation and having a coherent design.
My suspicion is that many (but not all, please no flames) of the biggest boosters of AI coding are simply inexperienced. If this is true, it makes sense that they wouldn't recognize the numerous foot-guns in AI generated code.
The whole thing feels like we're in a collective delusion because idiotic managers and C-suites are blindly lapping up the advertising slop coming from the AI companies.
Why are you so defensive about the tech?
Involved in any AI startups, perhaps?
Sometimes when I want to fine tune a query I am challenging AI to provide a better solution. I give it the already optimized query and I ask for better. I never got a better answer, sometimes because AI is hallucinating or because the changes that it proposes are not working in a way that is beneficial, it is like an idiot parrot is telling what it overheard in the brothel - good info if it is a war brothel frequented by enemy officers in 1916, but not these days.
That's what read replicas with read-only access are for. Production db servers should not be open to random queries and usage by people. That's only for the app to use.
This was my experience as well. However I have observed that things have been improving in this regard. Newer LLMs do perform much better. And I suspect they will continue to get better over time.
AI is just increasing the frequency of things turning to custard :)
At least for the only OLAP DB I use often - Amazon Redshift - that’s a solved problem with Workload Management Queues. You can restrict those users' ability to consume too many resources.
For queries that are used for OLTP, I usually try to keep those queries relatively simple. If there is a reason for read queries that consume resources, those go to read replicas when strong consistency isn’t required
As a quick aside there's one thing I wish SQL had that would make writing queries so much faster. At work we're using a DSL that has one operator that automatically generates joins from foreign key columns, just like
And you get the clients table automatically joined into the query. Having to write ten to twenty joins for every query is by far the worst thing, everything else about writing SQL is not that bad.
Although I think a good enough language server / IDE could automatically insert the join when you typed `credit.CLIENT->NAME`
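As a rough illustration of how an arrow reference like `credit.CLIENT->NAME` could be expanded into an explicit join, here is a small sketch. It is not the DSL mentioned above: the `FOREIGN_KEYS` map, the `clients.ID` key, and the `expand_arrows` helper are all hypothetical, and it only handles a single hop.

```python
import re

# Hypothetical foreign keys: (table, fk_column) -> (referenced_table, referenced_key)
FOREIGN_KEYS = {("credit", "CLIENT"): ("clients", "ID")}

# Matches table.FK_COLUMN->TARGET_COLUMN
ARROW = re.compile(r"(\w+)\.(\w+)->(\w+)")


def expand_arrows(select_expr: str, base_table: str) -> str:
    """Rewrite table.FK->COLUMN references into explicit JOIN clauses."""
    joins: list[str] = []

    def replace(match: re.Match) -> str:
        table, fk_column, target_column = match.groups()
        ref_table, ref_key = FOREIGN_KEYS[(table, fk_column)]
        join = f"JOIN {ref_table} ON {table}.{fk_column} = {ref_table}.{ref_key}"
        if join not in joins:  # join each referenced table only once
            joins.append(join)
        return f"{ref_table}.{target_column}"

    columns = ARROW.sub(replace, select_expr)
    return " ".join([f"SELECT {columns} FROM {base_table}", *joins])


print(expand_arrows("credit.AMOUNT, credit.CLIENT->NAME", "credit"))
```

A language server could do the same rewrite at typing time, since the foreign-key metadata is already in the database catalog.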
imo any hope of really leveraging llms in this context needs this + human review on additions to a shared ontology/semantic layer so most of the nuanced stuff is expressed simply and reviewed by engineering before business goes wild with it
It's not like it's some obscure thing, it's absolutely ubiquitous.
Relatively speaking it's not very complicated, it's widely documented, has vast learning resources, and has some of the best ROI of any DSL. It's funny to joke that it looks like line noise, but really, there is not a lot to learn to understand 90% of the expressions people actually write.
It takes far longer to tell an AI what you want than to write a regex yourself.
Whenever I have worked on code smells (performance issues, fuzzy test fails etc), regex was 3rd only to poorly written SQL queries, and/or network latency.
All-in-all, not a good experience for me. Regex is the one task that I almost entirely rely on GitHub Copilot in the 3-4 times a year I have to.
My experience is the exact opposite. Writing anything but the simplest regex by hand still takes me significant time, and I've been using them for decades.
Getting an LLM to spit out a regex is so much less work. Especially since an LLM already knows the details of the different potential dialects of regex.
I use them to write regexes in PostgreSQL, Python, JavaScript, ripgrep... they've turned writing a regex from something I expect to involve a bunch of documentation diving to something I'll do on a whim.
Here's a recent example - my prompt included a copy of a PostgreSQL schema and these instructions:
I ended up with 100 lines of SQL: https://gist.github.com/simonw/5b44a662354e124e33cc1d4704cdb...
The markdown portion of that is a good example of the kind of regex I don't enjoy writing by hand, due to the need to remember exactly which characters to escape and how:
Full prompt and notes here: https://simonwillison.net/2025/Apr/28/dashboard-alt-text/
I appreciate the LLM is useful for problems outside one's usual scope of comfort. I'm mainly saying that I think it's a skill where the "time economics" really are in favor of learning it and expanding your scope. As in, it does not take a lot learning time before you're faster than the LLM for 90% of things, and those things occur frequently enough that your "learning time deficit" gets repaid quickly. Certainly not the case for all skills, but I truly believe regex is one of them due to its small scope and ubiquitous application. The LLM can be used for the remaining 10% of really complicated cases.
As you've been using regex for decades, there is already a large subset of problems where you're faster than the LLM. So that problem space exists, it's all about how to tune learning time to right-size it for the frequency the problems are encountered. Regex, I think, is simple enough & frequent enough where that works very well.
It doesn't matter how fast I get at regex, I still won't be able to type any but the shortest (<5 characters) patterns out quicker than an LLM can. They are typing assistants that can make really good guesses about my vaguely worded intent.
As for learning deficit: I am learning so much more thanks to heavy use of LLMs!
Prior to LLMs the idea of using a 100 line PostgreSQL query with embedded regex to answer a mild curiosity about my use of alt text would have finished at the idea stage: that's not a high value enough problem for me to invest more than a couple of minutes, so I would not have done it at all.
A shortcut to type in natural language and get something I can validate in seconds is really useful.
Of course, there may be cases you didn't think of where it behaves incorrectly. But if that's true, you're just as likely to forget those cases when studying the expression to see "what it actually says". If you have tests, fixing a broken case (once you discover it) is easy to do without breaking the existing cases you care about.
So for me, getting an AI to write a regex, and writing some tests for it (possibly with AI help) is a reasonable way to work.
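The "regex plus tests" workflow might look something like the sketch below. The pattern is purely illustrative (a simple Markdown image matcher, not the regex from any comment here), and it deliberately ignores escaped brackets and optional titles. The point is that each case you care about is pinned down in a table.

```python
import re

# Illustrative pattern for Markdown images: ![alt text](url)
# Deliberately simple: no escaped brackets, no optional "title" part.
MARKDOWN_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")

# (input text, expected list of (alt, url) matches)
CASES = [
    ("![a cat](cat.png)", [("a cat", "cat.png")]),
    ("![](empty-alt.png)", [("", "empty-alt.png")]),
    ("[a link](page.html)", []),  # plain links must not match
    ("![one](1.png) and ![two](2.png)", [("one", "1.png"), ("two", "2.png")]),
]


def run_cases() -> list[str]:
    """Return a description of every failing case (empty list means all pass)."""
    failures = []
    for text, expected in CASES:
        actual = MARKDOWN_IMAGE.findall(text)
        if actual != expected:
            failures.append(f"{text!r}: expected {expected}, got {actual}")
    return failures


print(run_cases() or "all cases pass")
```

When a new broken input turns up, it goes into `CASES` first, and any replacement regex (hand-written or generated) has to keep the old cases passing.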
I needed one to do something with Markdown which was a very internal BigCo thing to need to do, something I'd never have written without weird requirements in play. It wasn't that tricky, but going back trying to get LLMs to replicate it after the fact from the same description I was working from, they were hopeless. I need to dig that out again and try it on the latest models.
I’ve found that the vast majority of programmers today do not have any foundation in formal languages and/or the theory of computation (something that 10 years ago was pretty common to assume).
It used to be pretty safe to assume that everyone from perl hackers to computer science theorists understood regex pretty well, but I’ve found it’s increasingly a rare skill. While it used to be common for all programmers to understand these things, even people with a CS background view that as some annoying course they forgot as soon as the exam was over.
Which is why I would ask an AI to build it if it could.
http://alf.nu/RegexGolf?world=regex&level=r00
took me 25.75 seconds, including learning how the website worked. I actually solved it in ~15 seconds, but I hadn't realized I got the correct answer because it was far too simple.
This website is much better https://regexcrossword.com/challenges/experienced/puzzles/e9...
https://blog.codinghorror.com/parsing-html-the-cthulhu-way/
https://en.wikipedia.org/wiki/Beautiful_Soup_(HTML_parser)
With an AI prompt you'll have to do the same thing, just more verbosely.
You will have to do what every programmer hates, write a full formal specification in English.
https://blog.codinghorror.com/regular-expressions-now-you-ha...
https://blog.codinghorror.com/parsing-html-the-cthulhu-way/
https://en.wikiquote.org/wiki/Jamie_Zawinski
Step 2...
> awesome-Text2SQL: https://github.com/eosphoros-ai/Awesome-Text2SQL
> Awesome-code-llm > Benchmarks > Text to SQL: https://github.com/codefuse-ai/Awesome-Code-LLM#text-to-sql
Sounds like a bunch of bespoke not-AI work is being done to make up for LLM limitations that point blank can’t be resolved.
Thus, this is mainly just a tool for true experts to do less work and still get paid the same, not a tool for beginners to rise to the level of experts.
Obviously being able to at least read a bit of SQL and understanding the basic idea of relational databases helps loads.
How on earth is this an AI job?
In the example you describe there are several technical things in nearly natural language and you mention two things that would be a drop down in a GUI. For starters this assumes you know what SQL is and your data layout or "schema".
Regardless of using AI, you need to understand the base technology.
SQL is not intractable for queries, once you have worked out the relationships. The relationship complexity will be the same for an AI prompt too no matter how cool you feel.
Your AI might find a customer 1:M shoes relationship or not. I suggest that anything beyond a couple of tables model will go horribly wrong.
I'm pretty comfortable with sql but still found it a fabulous tool recently. I have a sql database which describes a tree of some ~600k events. Each event is in a session (via session_id). Most events have a parent event - and trees of events can involve multiple sessions.
I wanted to add two derived columns to my events table. For each event, I wanted to name the root event for that event's tree and the root event within this session. I had code in typescript to do it - but unsurprisingly it was pretty slow. Well, it turns out you can write a recursive SQL query which can traverse the graph and populate those columns. I had no idea that was even possible.
ChatGPT managed it pretty well - though I ended up making a bunch of tweaks to the query it suggested to simplify it. I learned a bunch of SQL in the process - and that was cool! Obviously I could have read the SQL documentation and figured it out myself, but it was faster & easier using chatgpt. Writing SQL queries is a fantastic use case for LLMs.
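For anyone curious what such a recursive query can look like, here is a minimal sketch using SQLite from Python. It is not the query from the comment above: the `events` schema (`id`, `parent_id`, `session_id`) and the rule that the session root resets whenever a child's session differs from its parent's are assumptions made for illustration.

```python
import sqlite3

ROOTS_QUERY = """
WITH RECURSIVE walk(id, session_id, root_id, session_root_id) AS (
    -- Anchor: events without a parent are their own roots.
    SELECT id, session_id, id, id FROM events WHERE parent_id IS NULL
    UNION ALL
    -- Step: children inherit the tree root; the session root resets
    -- whenever the child is in a different session than its parent.
    SELECT e.id, e.session_id, w.root_id,
           CASE WHEN e.session_id = w.session_id
                THEN w.session_root_id ELSE e.id END
    FROM events AS e
    JOIN walk AS w ON e.parent_id = w.id
)
SELECT id, root_id, session_root_id FROM walk ORDER BY id
"""


def compute_roots(rows):
    """Load (id, parent_id, session_id) rows and return (id, root, session_root)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, parent_id INTEGER, session_id TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    result = conn.execute(ROOTS_QUERY).fetchall()
    conn.close()
    return result


sample = [(1, None, "A"), (2, 1, "A"), (3, 2, "B"), (4, 3, "B"), (5, None, "C")]
for row in compute_roots(sample):
    print(row)
```

The query walks the tree top-down from the parentless events, so each event is visited once; the selected rows could then be written back into the two derived columns.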