@korazail

korazail@lemmy.myserv.one · 2 days ago

Thanks for your reply, and I can still see how it might work.

I’m curious if you have any resources that do some end-to-end examples. This is where I struggle. If I have an atomic piece of code I need, I can maybe get it started with an LLM and finish it by hand, but anything larger seems to just always fail. So far the best video I found to try a start-to-finish demo was this: https://www.youtube.com/watch?v=8AWEPx5cHWQ

He spends plenty of time describing the tools and how to use them, but when we get to the actual work, we spend 20 minutes telling the LLM that it’s doing stuff wrong. There’s eventually a prototype, but to get there he had to alternate between ‘I still can’t jump’ and ‘here’s the new error.’ He eventually modified code himself, so even getting a ‘mario clone’ running requires an actual developer and the final result was underwhelming at best.

For me, a ‘game’ is this tiny product that could be a viable unit. It doesn’t need to talk to other services, it just needs to react to user input. I want to see a speed-run of someone using LLMs to make a game that is playable. It doesn’t need to be “fun”, but the video above only got to the ‘player can jump and gets game over if hitting enemy’ stage. How much extra effort would it take to make the background not flat blue? Is there a win condition? How to refactor this so that the level is not hard-coded? Multiple enemy types? Shoot a fireball that bounces? Power Ups? And does doing any of those break jump functionality again? How much time do I have to spend telling the LLM that the fireball still goes through the floor and doesn’t kill an enemy when it hits them?
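To make the fireball question concrete, here is a minimal, engine-agnostic sketch of the behavior being asked for: a fireball that bounces off the floor instead of passing through it, and removes an enemy when it touches one. This is not how kaboom.js or the video does it; the names `Box`, `Fireball`, `step` and the constants `GRAVITY`, `FLOOR_Y`, `BOUNCE` are all made up for illustration.

```python
from dataclasses import dataclass

# Assumed values, chosen only for illustration.
GRAVITY = 0.5    # added to vertical speed each frame (y grows downward)
FLOOR_Y = 100.0  # y coordinate of the floor surface
BOUNCE = 0.8     # fraction of vertical speed kept after a bounce


@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other: "Box") -> bool:
        # Standard axis-aligned bounding box overlap test.
        return (self.x < other.x + other.w and other.x < self.x + self.w
                and self.y < other.y + other.h and other.y < self.y + self.h)


@dataclass
class Fireball:
    box: Box
    vx: float
    vy: float = 0.0


def step(fireball: Fireball, enemies: list) -> list:
    """Advance the fireball one frame and return the enemies still alive."""
    fireball.vy += GRAVITY
    fireball.box.x += fireball.vx
    fireball.box.y += fireball.vy

    # If the fireball sank below the floor, clamp it back onto the
    # surface and reflect its vertical speed so it bounces.
    if fireball.box.y + fireball.box.h > FLOOR_Y:
        fireball.box.y = FLOOR_Y - fireball.box.h
        fireball.vy = -abs(fireball.vy) * BOUNCE

    # Any enemy the fireball overlaps is removed.
    return [e for e in enemies if not fireball.box.overlaps(e)]


fireball = Fireball(Box(0.0, 80.0, 8.0, 8.0), vx=4.0)
enemies = [Box(40.0, 84.0, 16.0, 16.0)]  # standing on the floor
max_bottom = 0.0
for _ in range(30):
    enemies = step(fireball, enemies)
    max_bottom = max(max_bottom, fireball.box.y + fireball.box.h)
print(len(enemies), max_bottom <= FLOOR_Y)
```

The clamp-then-reflect step is the part that stops the "fireball still goes through the floor" bug in this sketch; a real game would also need tile collisions, which are left out here.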

I could imagine that if the LLM was handed a well described design document and technical spec that it could do better, but I have yet to see that demonstrated. Given what it produces for people publishing tutorials online, I would never let it handle anything business critical.

The video is an hour long, and spends about 20 minutes in the middle actually working on the project. I probably couldn’t do better, but I’ve mostly forgotten my JavaScript and HTML canvas. If kaboom.js was my focus, though, I imagine I could knock out what he did in well under 20 minutes and have a better-architected design that handled the above questions.

Luckily, I haven’t yet been mandated to embed AI into my pseudo-developer role, but they are asking.

korazail@lemmy.myserv.one · 3 days ago

I think this is what will kill vibe coding, but not before there’s significant damage done. Junior developers will be let go and senior devs will be told they have to use these tools instead and to be twice as efficient. At some point enough major companies will have had data breaches through AI-generated code that they all go back to using people, but there will be tons of vulnerable code everywhere. And letting Cursor touch your codebase for a year, even with oversight, will make it really tricky to find all the places it subtly fucked up.

korazail@lemmy.myserv.one · 3 days ago

I have 3 questions, and I’m coming from a heavily AI-skeptic position, but am open:

1. Do you believe that providing all that context, describing the existing patterns, creating an implementation plan, etc, allows the AI to write code both better and faster than if you just did it yourself? To me, this just seems like you have to re-write your technical documentation in prose each time you want to do something. You are saying this is better than ‘Do XYZ’, but how much twiddling of your existing codebase do you need to do before an AI can understand the business context of it? I don’t currently do development on an existing codebase, but every time I try to get these tools to do something fairly simple from scratch, they just flail. Maybe I’m just not spending the hours to build my AI-parsable functional spec. Every time I’ve tried this, asking something as simple as (and paraphrased for brevity) “write an Asteroids clone using JavaScript and HTML 5 Canvas” results in a full failure, even with multiple retries chasing errors. I wrote something like that a few years ago to learn JavaScript and it took me a day-ish to get something that mostly worked.
2. Speaking of that context: are you running your models locally, or do you have some cloud service? If you give your entire codebase to a 3rd party as context, how much of your company’s secret sauce have you disclosed? I’d imagine most sane companies are doing something to make their models local, but we see regular news articles about how ChatGPT is training on user input and leaking sensitive data if you ask it nicely, and I can’t imagine all the pro-AI CEOs are aware of the risks here.
3. How much pen-testing time are you spending on this code: error handling, edge cases, race conditions, data sanitization? An experienced dev understands these things innately, having fixed these kinds of issues in the past, and knows the anti-patterns and how to avoid them. In all seriousness, I think this is going to be the thing that actually kills AI vibe coding, but it won’t be fast enough. There will be tons of new exploits in what used to be solidly safe places. Your new web front-end? It has a really simple SQL injection attack. Your phone app? You can tell it your username is admin’[email protected] and it’ll let you order stuff for free since you’re an admin.
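To make the “really simple SQL injection” point concrete, here is a minimal sketch using Python’s built-in `sqlite3` module. The `users` table, the function names, and the payload are all made up for illustration; the point is only the difference between pasting user input into the query string and passing it as a bound parameter.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical in-memory table, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", 0), ("admin", 1)],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text itself.
    query = "SELECT name, is_admin FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the input is bound as a value, never parsed as SQL.
    query = "SELECT name, is_admin FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


conn = make_db()
payload = "nobody' OR '1'='1"
print(find_user_unsafe(conn, payload))  # every row comes back, admin included
print(find_user_safe(conn, payload))    # no user has that literal name
```

The unsafe version is exactly the kind of code that looks fine in a demo and passes a happy-path test, which is why it is easy to miss in review.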

I see a place for AI-generated code, for instant functions that do something blending simple and complex. “Hey claude, write a function to take a string and split it at the end of every sentence containing an uppercase A”. I had to write weird functions like that constantly as a sysadmin, and transforming data seems like a thing an AI could help me accelerate. I just don’t see that working on a larger scale, though, or trusting an AI enough to allow it to integrate a new function like that into an existing codebase.
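For reference, one possible reading of that prompt looks like the sketch below. The prompt is ambiguous, so this assumes that a sentence ends at `.`, `!` or `?`, and that sentences without an uppercase A stay attached to whatever follows them. The function name is made up, and abbreviations like “e.g.” would be split incorrectly.

```python
import re


def split_after_sentences_with_a(text: str) -> list:
    """Split text after every sentence that contains an uppercase 'A'."""
    # A sentence is a run of text up to and including its closing
    # punctuation; a trailing fragment with no punctuation also counts.
    sentences = re.findall(r"[^.!?]*[.!?]+|[^.!?]+$", text)

    chunks = []
    current = ""
    for sentence in sentences:
        current += sentence
        if "A" in sentence:
            # Cut here: this sentence contains an uppercase A.
            chunks.append(current.strip())
            current = ""
    if current.strip():
        # Leftover sentences with no uppercase A form the last chunk.
        chunks.append(current.strip())
    return chunks


example = "Alice went home. She slept. Bob ate An apple! Then nothing happened"
print(split_after_sentences_with_a(example))
# ['Alice went home.', 'She slept. Bob ate An apple!', 'Then nothing happened']
```

Even for something this small, the spec has to decide what a “sentence” is, which is the part an LLM will guess at unless told.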

korazail@lemmy.myserv.one · 3 days ago

I’d wager that the votes are irrelevant. Stack Overflow is generously <50% good code and is mostly people saying ‘this code doesn’t work – why?’ and that is the corpus these models were trained on.

I’ve yet to see something like a vibe coding livestream where something got done. I can only find a lot of ‘tutorials’ that tell how to set up tools. Anyone want to provide one?

I could… possibly… imagine a place where someone took quality code from a variety of sources and generated a model that was specific to a single language, and that model was able to generate good code, but I don’t think we have that.

Vibe coders: Even if your code works and seems to be a success, do you know why it works, how it works? Does it handle edge cases you didn’t include in your prompt? Does it expose the database to someone smarter than the LLM? Does it grant an attacker access to the computer it’s running on, if they are smarter than the LLM? Have you asked your LLM how many ‘r’s are in strawberry?

At the very least, we will have a cyber-security crisis due to vibe coding; especially since there seems to be a high likelihood of HR and Finance vibe coders who think they can do the traditional IT/Dev work without understanding what they are doing and how to do it safely.

korazail@lemmy.myserv.one · 15 days ago

This is my fear. It’s still possible, barely, to buy a dumb TV. When my current fridge/dishwasher/stove/etc dies in a few years, will there even be a dumb version? Will it cost 5x the price of a spyware version? How about my thermostat? HVAC? Car? And will attempting to disable any of this spyware land me in prison?

Right now, uninformed/unaware/stupid people are affected by this. Pretty soon, everyone will be, or they will have to forgo things we consider to be necessities now, like refrigeration and cell phones, or be rich enough to buy the privacy-focused models.

I can’t immediately find it, but I just saw another post about a new privacy-focused cellphone with a huge price tag. The established manufacturers have a cost advantage. Samsung et al. can easily make a new fridge with fewer consumer rights, but a new company will have to spend tons of capital to make a factory to put out a comparable product; and they won’t have the advantage of selling your data to subsidize the price.

Privacy is, and will increasingly become, a commodity unless we fight for it.

korazail@lemmy.myserv.one · 4 months ago

Like many things, a tool is only as smart as the wielder. There’s still a ton of critical thinking that needs to happen as you do something as simple as bake bread. Using an AI tool to suggest ingredients can be useful from a creative perspective, but should not be assumed accurate at face value. Raisins and Dill? maybe ¯\_(ツ)_/¯, haven’t tried that one myself.

I like AI for being able to add detail to things or act as a muse, but it cannot be trusted for anything important. This is why I’m ‘anti-AI’. Too many people (especially in leadership roles) see this tool as a solution for replacing expensive humans with something that ‘does the thinking’; but as we’ve seen elsewhere in this thread, AI CAN’T THINK. It only suggests items that are statistically likely to be next/near based on its input.

In the Security Operations space, we have a phrase: “trust but verify”. For anything AI, I would use “doubt, then verify” instead. That all said, AI might very well give you a pointer to the place to ask how much motrin an infant should get. Hopefully, that’s your local pediatrician.

korazail@lemmy.myserv.one · 5 months ago

I think there is potential that this was intended.

PalWorld was SO on the nose modeled after pokemon plus Breath of the Wild that it couldn’t be anything but a stab at Nintendo. And yet, it seems that (I’m not a lawyer) they skirted around ever actually infringing on copyrights. If you want to build a zoo full of creatures, there are only so many ways you can combine things without making a fire dog or ice dragon, and then comparisons can be made. PalWorld has many creatures that I don’t recognize as being similar to existing pokemon. Given that Nintendo has not gone after PalWorld for copyright infringement, I’d say that means they don’t have a case.

Patents are another angle, and I’m far from a patent lawyer. Have you ever read one? They are full of jargon and what seem to be nonsense words, especially a software patent for a video game. I found an article that describes how Nintendo can use a ‘new’ patent to attack PalWorld, but near the end he clearly calls out that there is a difference between ‘legal’ and ‘legitimate.’ I can’t seem to find the actual ‘throwing a ball to make a thing happen’ new patent, but I’d assume PalWorld doesn’t infringe the original patent, or Nintendo would have just used that one. The article author also notes how Nintendo applied for a divisional patent near the end of a window for doing so, which presumably extends the total lifetime of the patent protection. A new divisional patent last year probably means we have 40 years of no ‘ball-throwing mechanics.’

I hope that this whole thing is a stunt. PalWorld was commercially successful, and even if they lose and have to modify the game, it will remain successful. I think that there’s a possibility that the developer and publisher are fighting against software patents kind of in general and used PalWorld as bait that Nintendo fell for.

If they lose, then there will be a swath of gamers who are at least mildly outraged at software patents. Popular opinion can (occasionally) sway policy.

If they win, then we have another chink in the armor of software patents as a whole. See Google v. Oracle, which dealt with whether an API can be protected by copyright.

If we can manage to kill software patents for gameplay mechanics, like throwing balls at things, being able to take off and land seamlessly, or having a recurring enemy taunt you, then we get better games that remix things that worked.

Imagine how terribly different games would be if someone had patented “An action where a user presses a button to swing their weapon, and if that weapon hits an enemy, that enemy takes damage.”