Non-engineers, AI coding, and corporate compliance?
Part of my role at work is security policy & implementation. I can't figure this one out, so maybe someone here will have some advice.
With the advent of AI coding, people who don't know how to code are now using AI to automate their work. This isn't new: previously they might have used other low-code tools like Excel, UIPath, or n8n, but those still required learning the tool. Now anyone can "vibe code" and get an output, which is fine for engineers who understand how the output should work and can design how it should be tested (edge cases, etc.)
I had a team come up to me saying they'd managed to automate their work, which is good, but they did it with ChatGPT. The code works as they expected, but they don't fully understand how it works, and of course they're deploying it "to production", meaning they're setting up an environment that's supposed to be for internal tools but feeding it real customer data from the production systems.
If you're an engineer, this usually violates a lot of policies: the code should be peer reviewed by people who understand what it does (including the business context), QA should test it, think through edge cases and the best ways to exercise them, and sign it off, and the code should be developed and tested in a non-production environment with fake data.
I can't think of a way non-engineers can do this. They can't read code (and it gets worse if you need two people in the same team to review each other's work), and if you're outsourcing the review to AI, the AI company doesn't accept liability, nor can you retrain the AI from postmortems. The only option is to fold lessons learned into the prompt, and I suspect it eventually becomes one long holy bible everyone has to paste into a limited context window. These users also aren't trained to work on non-production data (if you ever try, they usually claim the fake data doesn't match production, which I think is because they aren't trained to design and test for edge cases). The only direct fix is asking engineers to review their code, but engineers aren't cheap and their time is better spent on more important work.
So far I think the best way to approach this problem is to think of it like Excel: the formulas are always safe to use; they don't send data to the internet, they don't create malware, and the worst thing they can do is probably corrupt the file or hang your PC. And since most people never learned VBA, they never wrote any. Now you have people copy-pasting VBA code they don't understand. I think the new AI workspace has to be built around technical guardrails the AI is confined to, probably some low-code tool that people using AI are required to work in (say, n8n). For example: blocks that only do computation can be used freely, while blocks that send data to the intranet/internet or run arbitrary code require approval before use. Engineers can also build safe blocks, such as a Slack block that can only send messages to the corporate workspace (see the sketch below).
Has your workplace adjusted its policies for this AI epidemic? Or do you have other ideas you'd like to share?
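To make the "safe block" idea concrete, here is a minimal sketch of what one of those engineer-built blocks could look like. The allowlist values, workspace prefix, and function names are hypothetical; the point is only that the part deciding *where data is allowed to go* is owned by the security team, not by whoever prompts the AI.

```python
# Minimal sketch of a "safe block": a pre-approved wrapper non-engineers can call,
# while the risky part (where data may be sent) stays locked down by engineers.
# The allowlist entries and workspace prefix below are hypothetical.
import json
import urllib.request
from urllib.parse import urlparse

APPROVED_WEBHOOK_HOSTS = {"hooks.slack.com"}           # hypothetical allowlist
APPROVED_WORKSPACE_PATH_PREFIX = "/services/T0CORP"    # hypothetical corporate workspace ID


def send_corporate_slack_message(webhook_url: str, text: str) -> None:
    """Send a Slack message, refusing anything outside the corporate workspace."""
    parsed = urlparse(webhook_url)
    if parsed.hostname not in APPROVED_WEBHOOK_HOSTS:
        raise PermissionError(f"Destination {parsed.hostname!r} is not on the allowlist")
    if not parsed.path.startswith(APPROVED_WORKSPACE_PATH_PREFIX):
        raise PermissionError("Webhook does not belong to the corporate workspace")

    payload = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        webhook_url, data=payload, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req, timeout=10)
```

The computation blocks can be completely open; only wrappers like this one, which move data out of the workspace, need review and approval.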
For some context, I am approaching this as someone who works with code on a daily basis and has experimented with vibe coding to get an idea of the quality of code involved.
In my mind, only one of those things can really be true. To be more specific, they can only tell on a surface level whether the code seems to do what they expect. They have no clue about any shenanigans going on inside that might produce valid-looking output that is actually false.
You are comparing it to "other" low-code tools, but I'd say this is far from what we typically understand low-code tools to be. That's because there is fluid, raw code involved, which is much more flexible and unpredictable than actual low-code tools, whose limitations exist specifically to contain a lot of these risks.
To be frank, I don't see any way to allow tools like this without actual developers reviewing the resulting code. Without that sort of review process, running any vibe-based product is effectively the same as blindly downloading executables and running them without any due diligence.
This is not a thing, like at all. The big AI companies already have a lot of trouble making these models follow specific guidelines and people find all sorts of creative or simply accidental ways around these limitations.
Again, without the code being vetted by humans who know what they are doing there is no sane way to allow vibe coding based applications in production environments.
Now, before any C-suites get any ideas: by human review I don't mean outsourcing it to the cheapest company offering such a service, because they will likely be using LLMs anyway to cut corners, and if they aren't, they're not going to employ people with the technical depth to actually do a proper risk assessment.
Simply because I don't know any developer who likes to do nothing but review code, and certainly not code made purely by LLMs.
So even with a developer involved to vet the code you are taking, in my opinion, insane risks allowing these products to be used.
Edit:
One clarification that I'd like to make. Me calling the risk insane is based on the remark that customer data is involved.
And we should all remember that it often takes longer to review code you did not write than to write your own code.
Particularly if that code has useless or nonexistent comments.
This is the hidden cost of vibe coding. Which will be ignored until a new 'largest worst data breach in history' comes along.
You're far more optimistic than me. Has much changed after the Target and Equifax breaches?
From some previous research, Target was technically compliant with PCI DSS at the time of the breach. Updated standards have not changed in a way that would have invalidated that. I'm not sure if they have actually implemented network isolation at all of their locations, and that certainly would not be publicly shared.
And Equifax is still one of the big three credit reporting agencies.
What you're describing sounds like it is (or could evolve into) "shadow IT." If a business unit feels too locked down by IT but somehow has the ability to build their own apps/workarounds, there's always a risk they'll do it.
You need to look at this from two different angles:
Why are they unhappy with the current state of things? Does one of their business applications not fully meet their needs? Were they promised custom applications by IT that haven't been delivered in a reasonable timeframe?
If they shouldn't be doing any development outside of sanctioned environments (low-code environments it seems, based on your post), what is allowing them to do it? Do they need additional restrictions on client PCs to stop it? Firewall rules to block access to ChatGPT? Additional screening before allowing unreviewed code or applications to run on production servers?
Ultimately, business units are just trying to get their job done. They aren't doing this because they want to spite IT (though sometimes past events lead them to develop negative perceptions of IT). Try to work backwards with them - identify their desired end result, then evaluate how they can reach it. With existing tooling is ideal, but if that doesn't work, explore the viability of other options.
In this vein, it’s probably wise to really underline and explain the risks of using ChatGPT, as part of the conversations around those two questions. If you’re already listening to their needs, people should be at their most receptive to understanding in turn why you’re worried about the vibe coding approach and the data security angle.
Compliance will never be 100% perfect, but getting a shared understanding of the risks should help a lot. Draw an analogy with the legal department: it’d often be faster and easier to make an agreement directly with the client in a two sentence email, but do you want to be the one who’s responsible for that when it gets everyone sued? Or do you happily let the lawyers take their time to go over the details so it’s much less likely to come to that, and it’s their problem if someone does get sued? Same goes for tech, but it’s production outages and accounting errors and data breaches rather than lawsuits. Although probably also lawsuits, to be honest.
Re. security, if you're concerned about network access, it's always possible to mandate that all vibe coded, unreviewed software has to run in sandboxes (VMs, jails, etc.) that have strict access controls which are defined by your security department. If it can persist any data, though, you'll have to be concerned with GDPR compliance.
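As a rough illustration of what that mandate could look like in practice, here is a sketch that assumes Docker as the sandbox; the image name, resource limits, and paths are placeholders, and a real policy would obviously be more involved.

```python
# Rough sketch: run an unreviewed script inside a locked-down container.
# Requires Docker on the host; image name, limits, and paths are placeholders.
import subprocess


def run_untrusted_script(script_path: str) -> subprocess.CompletedProcess:
    return subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",        # no network access at all
            "--read-only",              # container filesystem is read-only
            "--memory", "512m",         # cap resource usage
            "--cpus", "1",
            "-v", f"{script_path}:/app/script.py:ro",  # mount the script read-only
            "python:3.12-slim",
            "python", "/app/script.py",
        ],
        capture_output=True,
        text=True,
        timeout=300,
    )
```

Anything that genuinely needs selective outbound access would get a dedicated network with firewall rules defined by security, rather than blanket internet access.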
Below is a rant. I'm tired of the software development industry. Please skip it if you're still trying to stay sane in this career; I'm just old and jaded and so very tired.
A sad rant by a sad soul
That said, smaller companies don't get sued all that often under consumer protection acts! And even massive leaks of data (e.g. that Transunion leak, apparently a Floridian background check company last year, etc. just search "SSN breach" on Google) are often only punished with a slap on the wrist. You can probably find some private insurance which is willing to cover your company for negligence (e.g. allowing vulnerable people to be tracked and harmed, then paying out to their families), then to eat that as a cost of doing business. And if you have lawyers on retainer anyhow, using them to stall out incoming lawsuits and negotiate down settlements is a no brainer! You could even set up a subsidiary that owns all work done with LLM-generated code, making it even harder for damages to reach the parent company.
Societally, we missed the chance to regulate software development at all. Other engineering disciplines have professional standards for those who intend to practice it, and there are far reaching, industry-wide regulations on how to do so safely. Practitioners are required to continually educate themselves on best practices, and are held liable for failures in the work that they create. By contrast, despite software being capable of ruining millions of lives with the click of a button, we allow teams of entirely unqualified people to crank out products whilst strung out on stimulants. Things have been bad for a while, and I cannot fathom how they could ever improve (unless we have a literal AI uprising. That would be nice). A fraction of a fraction of a percentage of people care, and even those are easily dissuaded from taking real action by the literal trillion dollar companies who are incentivized to keep the status quo in place.
As a sidebar -- the engineers you're thinking of probably still import libraries written by anonymous internet people, and regardless of how skilled they are, they can't feasibly review every line of code in their entire stack (and every update thereof). Thus arises the issue of supply chain attacks. So you were already fighting a losing battle on that front.
Our company (federal contractor) has all AI programs/websites blocked on the site network, and has its own ChatGPT clone (that I’m pretty sure is just reskinned ChatGPT).
That would be the low-hanging fruit for your company. Make it so that all the data people are feeding into AI at least stays on servers you control. Beyond that, require all code to go through human review prior to implementation, but more importantly (and echoing what someone else said), determine why people are trying to code their own software. If it's because the responsible orgs take two years to create a simple program, then yeah, it isn't surprising that other orgs will try to run around y'all.
I think a lot of companies have strict policies to protect private data. We can't use most LLMs because many interfaces harvest data from the inputs, and it'd be extremely bad if they got users' private data from an employee uploading
users_financial_data.csv
for parsing or analysis. The other major risk with using the LLM itself as the code or authoritative tool is that it's nondeterministic. There's no guarantee it will produce the same output every time. That's fine if I just need 95% accuracy for classifying some data, but it's really, really bad if we're talking about strict business decisions like who gets banned for being a high-risk customer. They may fail in new and spectacular ways at unexpected times. As a small example, I absolutely wouldn't trust them for math, because they're not calculators.
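To make that split concrete, here is a hypothetical sketch of the pattern I mean: the model's output is advisory only, and the binding decision lives in deterministic, auditable rules. The thresholds and field names are invented for illustration.

```python
# Hypothetical sketch: the LLM output is advisory only; the binding decision
# comes from deterministic, auditable rules. Thresholds and fields are invented.
from dataclasses import dataclass


@dataclass
class Customer:
    chargeback_count: int
    account_age_days: int
    llm_risk_label: str  # e.g. "high" / "low", produced elsewhere by a model


def should_ban(customer: Customer) -> bool:
    # Deterministic rules make the call; the same input always gives the same output.
    return customer.chargeback_count >= 3 and customer.account_age_days < 30


def needs_human_review(customer: Customer) -> bool:
    # The model's label can only escalate a case for review, never ban on its own.
    return customer.llm_risk_label == "high" and not should_ban(customer)
```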
To your original question, I think there are two follow-up questions:
These are excellent questions.
I think the answer to this question may be the most significant thing to happen to society in a long time.
I think everyone I've met at this point has admitted to putting trade secrets and confidential info into ChatGPT. I wonder what OpenAI will do with this data when the bonfire of investment money dries up? I imagine OpenAI will look more like McAfee in 10 years than an AI company (i.e. rent-seeking from the problem they caused).
Aside from that, AI is IMO really starting to drop the bottom out of do-nothing corporate administration. Millions of highly paid people that all of us deal with on a daily basis have been offloading the burden of their jobs onto exploited underlings. Now those underlings have a release valve, and make no mistake, they are using it broadly. What happens when no one is on watch anymore? Well, I can tell you: it's called collapse. How that looks remains to be seen, or whether anti-AI laws get drafted to stop it.
I’m not sure if you consider online responses as meeting someone, but fwiw, I’ve never put trade secrets or confidential info into an LLM.
I'm also a second person! I have the "use my data to improve the experience for others" setting turned off (I'm sure it doesn't actually do anything, but meh). I also don't use it on my work computer, and when I need to use it I remain somewhat vague. I access it all in a sandboxed browser tab in Firefox on a personal machine. In the event I have to copy/paste a code segment, I obfuscate any references to internal names and anything relating to the company. I operate it on a "need to know" basis, really no different than if I were asking the same questions on StackOverflow. Roughly, the scrubbing looks like the sketch below.
I do know of colleagues who absolutely have pasted confidential emails and documents into ChatGPT though...
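For what it's worth, the kind of scrubbing I mean looks roughly like this; the internal names and patterns are made up for illustration.

```python
# Rough illustration of scrubbing a snippet before pasting it anywhere external.
# The internal names and replacements below are made up.
import re

REDACTIONS = {
    r"\bAcmeCorp\b": "CompanyX",              # company name (made up)
    r"\bbilling_reconciler\b": "service_a",   # internal service name (made up)
    r"\b10\.2\.\d+\.\d+\b": "10.0.0.0",       # internal IP range (made up)
}


def scrub(snippet: str) -> str:
    for pattern, replacement in REDACTIONS.items():
        snippet = re.sub(pattern, replacement, snippet)
    return snippet


print(scrub("AcmeCorp's billing_reconciler failed to reach 10.2.14.7"))
# -> "CompanyX's service_a failed to reach 10.0.0.0"
```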
Ah -- if you're using these tools for work, note that there are additional licensing issues that can become problematic regardless of whether the code is obfuscated. Even using Stack Overflow can be problematic: technically everything there is licensed under Creative Commons Attribution-ShareAlike, so copy/pasting answers without attribution can be considered IP infringement! It depends on the code, though, as IIRC anything that is insufficiently creative fails to meet the criteria for copyrightability.
Also, if you are OK with slightly worse answers, local LLMs (with favourable licenses) can likely give you reasonable answers without any concern of leaking data to a third party. Could be an option depending on your circumstances.
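As a rough sketch, assuming something like Ollama serving a model on its default local port (adjust for whatever runner and model you actually use), a local query can be as simple as:

```python
# Rough sketch: querying a locally hosted model so nothing leaves the machine.
# Assumes an Ollama-style server on localhost:11434; model name is a placeholder.
import json
import urllib.request


def ask_local_model(prompt: str, model: str = "llama3") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(ask_local_model("Explain GDPR data minimisation in one sentence."))
```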
Ah yeah, I never copy/paste directly from the prompt; I use it more like a calculator to check my math, and occasionally for consultation and troubleshooting. We do have a Copilot license, but that hasn't been rolled out to everybody just yet. Also, my code technically isn't customer facing - mostly internal automation tools and infrastructure-as-code stuff. Not sure if there's a legal distinction there.
My work laptop does have a 6GB (?) Nvidia A1000 GPU so I'm hoping I can get permission to run a local LLM off it. Barring that, I've been meaning to take another crack at getting GPU passthru working in proxmox on my homelab...
I'm not sure this is the perspective you are going for, but in my head it depends on the level of programming. If we are talking about VBA or local Python/PowerShell scripts that a small part of the company uses, I find these to be pretty low risk. We handle a lot of sensitive data and are very careful with how we use it, but if someone can use one of these local programs to make something work... that's great. I'd say most folks in our company still aren't comfortable doing this; they come to a BA or programmer to ask their opinion, which gives us a bit of a peer review with the business area. We aren't saying no, but there is a well-known "check with the dev team before making this productionalized" approach. Is it perfect? Absolutely not, but we are working with it.
If it’s high risk, I don’t see a good way forward other than to disallow using it in production and getting someone who knows what they’re doing to rewrite it, using the one they built as a prototype / example.
Between having a working version and the AI tools now available, maybe it wouldn’t take too long?