this post was submitted on 23 Aug 2023

Technology


It's not the first time a language or tool has been lost to the annals of the job market, e.g. VB6 or FoxPro. Previously, though, such transitions happened gradually, giving most people enough time to adapt to the changes.

I wonder what it's going to be like this time, now that the machine (with the help of humans, of course) can accomplish an otherwise risky, multi-month corporate project much faster. What happens to all those COBOL developer jobs?

Pray share your thoughts, especially if you're a COBOL professional and have more context around the implications of this announcement 🙏

top 50 comments
[–] [email protected] 2 points 1 year ago (2 children)

Why Java instead of C# or Go though?

[–] [email protected] 1 points 1 year ago (1 children)

Because IBM doesn't want to tie themselves to Google or Microsoft. They already have their own builds of OpenJDK.

[–] [email protected] 1 points 1 year ago

No. Because IBM sells WebSphere, so Java it is, so they can upsell you for more contract labor.

[–] [email protected] 1 points 1 year ago (1 children)

Because COBOL is mainly used in enterprise environments, where they most likely already run Java software that interfaces with the old COBOL software. Plus, modern Java is a pretty good language; it's not 2005 anymore.

[–] [email protected] 0 points 1 year ago

Java is a POS and that’s before log4j.

[–] [email protected] 1 points 1 year ago

Without a requirements doc stamped in metal you won’t get 1:1 feature replication

This was kind of a joke, but it's actually very real, tbh. The problems that companies have with human devs trying to bring ancient systems into the modern world will all be replicated here. The PM won't stop trying to add features just because the team doing it is using an LLM, and the team doing it won't be the team that built the original, so they won't get all the nuances and intricacies right. So you get a strictly worse product, but it's cheaper (maybe), so that has to be balanced against the cost of the loss in quality.

[–] [email protected] 1 points 1 year ago

Oh FFS, there is nothing magical about COBOL, like it's some kind of sword in the stone that only a chosen few can draw. COBOL is simple(-ish), COBOL is verbose. That's why there is so much of it.

The reason you don't see new developers flocking to these mythical high-paying COBOL jobs is that it's not about the language, but rather about maintaining these ginormous, mission-critical applications that are basically black boxes due to the loss of institutional knowledge. Very high risk with almost no tangible, immediate reward, so don't touch it. Not something you can just throw a new developer at and hope for the best; the only person who knew this stuff was some guy named "John", and he retired 15 years ago! Etc., etc.

Also, this is IBM we're talking about, so purely buzzword-driven development. IBM isn't exactly known for pushing the envelope these days. Plus, transpilers have existed as a concept since... forever, basically? I doubt anything more will come of this other than upselling existing IBM customers who are already replacing COBOL.

[–] [email protected] 1 points 1 year ago

I have my doubts that this works well; every LLM we've seen that translates or writes code often makes mistakes and outputs garbage.

[–] [email protected] 1 points 1 year ago

This sounds no different than the static analysis tools we’ve had for COBOL for some time now.

The problem isn’t a conversion of what may or may not be complex code, it’s taking the time to prove out a new solution.

I can take any old service program on one of our IBM i machines and convert it to Java, no problem. The issue arises if some other subsystem that relies on it gets stalled out because the activation group is transient and spin-up of the JVM is the stalling part.

Now suddenly I need named activation, and that means I need to take lifetimes into account. Static values suddenly live between requests when procedures don't initialize them. And all of that is a great way to start leaking data all over the place. And when you start putting other people's phone numbers on 15-year contracts that have serious legal ramifications, legal doesn't tend to like that.
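
A minimal Java sketch of that failure mode (all names here are hypothetical, made up for illustration): a value that was effectively fresh on every call under a transient activation group quietly persists between requests once the runtime becomes long-lived, and one caller's data leaks into another caller's output.

```java
// Hypothetical sketch: a field that was implicitly reset per invocation in a
// transient activation group now survives across requests in a long-lived one.
class ContactFormatter {
    // In the transient model this was effectively fresh per call; in a named
    // activation (or a persistent JVM) it lives between requests.
    private static String lastPhoneNumber = "";

    static String format(String name, String phone) {
        if (phone != null && !phone.isEmpty()) {
            lastPhoneNumber = phone;          // procedure never re-initializes on entry
        }
        return name + ": " + lastPhoneNumber; // stale value leaks when phone is absent
    }
}
```

Call it for "Alice" with a phone number and then for "Bob" without one, and Bob's record comes back carrying Alice's number, which is exactly the kind of cross-request leak described above.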

It isn’t just enough to convert COBOL 1:1 to Java. You have to have an understanding of what the program is trying to get done. And just looking at the code isn’t going to make that obvious. Another example, this module locks a data area down because we need this other module to hit an error condition. The restart condition for the module reloads it into a different mode that’s appropriate for the process which sends a message to the guest module to unlock the data area.

Yes, I shit you not. There is a program out there doing critical work where the expected execution path is to on purpose cause an error so that some part of code in the recovery gets ran. How many of you think an AI is going to pick up that context?
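
The pattern described above can be sketched like this (every class and method name is hypothetical): the "error" is the expected execution path, and the real work happens in the recovery handler. Translated mechanically, line by line, the intent is invisible.

```java
// Hypothetical sketch of deliberate error-as-control-flow: the host module
// locks the data area on purpose so the guest module hits an error, and the
// guest's recovery path does the actual intended work.
class DataArea {
    private boolean locked = false;
    void lock()   { locked = true; }
    void unlock() { locked = false; }
    boolean isLocked() { return locked; }
    void write(String value) {
        if (locked) throw new IllegalStateException("data area locked");
        // ... actual write elided ...
    }
}

class GuestModule {
    static String run(DataArea area) {
        try {
            area.write("payload");        // expected to FAIL: host locked it on purpose
            return "normal";
        } catch (IllegalStateException e) {
            area.unlock();                // the recovery path is the intended path
            return "restart-mode";        // reload in the mode the process needs
        }
    }
}
```

Nothing in the guest module's source says the exception is the point; only institutional knowledge (or very good comments) tells you that.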

The tools back then were limited, so programmers did all kinds of hacky things to get particular things done. We've got tools now to fix that; it's just that so much has already been layered on top of the way things work right now. Pair that with the fact that we cannot buy a second machine to build a new system on, and that any new program must work 99.999% right out of the gate.

COBOL is just a language; it's not the biggest problem. The biggest problem is the expectation. These systems run absolutely critical functions that simply cannot fail. Trying to foray into Java or whatever language means we have to build a system that doesn't have 45 years' worth of testing behind it, running perfectly. It's just not a realistic expectation.

[–] [email protected] 1 points 1 year ago

"all those COBOL developer jobs" nowadays probably fit in one bus. That's why every company that can afford it moves away from COBOL.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (4 children)

according to a 2022 survey, there’s over 800 billion lines of COBOL in use on production systems, up from an estimated 220 billion in 2017

That doesn't sound right at all. How could the amount of COBOL code in use quadruple at a time when everyone is trying to phase it out?

[–] [email protected] 1 points 1 year ago (1 children)

The 2022 survey accounted for code that the 2017 survey missed?

[–] [email protected] 1 points 1 year ago

I think it's more likely that one survey or the other (or both) are simply nonsense.

[–] [email protected] 1 points 1 year ago

It could mean anything: the same code used in production in new ways, slightly modified code, newly discovered COBOL where the original language was a mystery, new requirements for old systems. Seriously, it could be too many things for that to be a useful metric without context.

[–] [email protected] 1 points 1 year ago

Maybe some production systems were replicated at some point and they're adding those as unique lines?

[–] [email protected] 0 points 1 year ago (1 children)

Because it’s not actually getting phased out in reality

[–] [email protected] 1 points 1 year ago

But it isn't getting quadrupled either, at least because there aren't enough COBOL programmers in the world to write that much new code that quickly.

[–] [email protected] 0 points 1 year ago (1 children)

Not a COBOL professional, but I know companies that have tried (and failed) to migrate from COBOL to Java because of the enormously high stakes involved (usually financial).

LLMs can speed up the process, but ultimately nobody is going to just say "yes, let's accept all suggested changes the LLM makes". The risk appetite of companies won't change because of LLMs.

[–] [email protected] 0 points 1 year ago (1 children)

Wonder what makes it so difficult. "COBOL to Java" doesn't sound like an impossible task, since transpilers exist. Maybe they can't get similar performance characteristics out of the auto-transpiled code?

[–] [email protected] 0 points 1 year ago (1 children)

Translating it isn't the difficult part. It's convincing a board room full of billionaires that they should flip the switch and risk having their entire system go down for a day because somebody missed a bug in the code and then having to explain to some combination of very angry other billionaires and very angry financial regulators why they broke the economy for the day.

[–] [email protected] 0 points 1 year ago (2 children)

Well, I'd rather the day come sooner than later. Also, this is why you have... backup servers and development environments. You don't just flip the switch randomly one day after the code is written. You run months and months of simulated transactions on the new code until you get an adequate number of bugs fixed.

There will come a time when these old COBOL machines will just straight-up die, and no one can be assed to keep making new hardware for them. And the programmers will all die out too. And then you're shit out of luck. I'd rather the last few remaining COBOL programmers help translate to some other long-lasting language before they all kick the bucket, not after.

[–] [email protected] 1 points 1 year ago

Nah, dump it all.

COBOL programs don’t handle UTF-8 and other modern things like truly variable-length strings.

Best thing to do is refactor and periodically test by turning off the mainframe system to see what breaks. Why something was done is lost to the sands of time at this point.
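
As a rough illustration of that fixed-length point (the `picX10` helper is made up for this sketch), emulating a COBOL `PIC X(10)` field in Java shows how byte-level truncation silently pads short values and can split a multi-byte UTF-8 character, leaving a malformed sequence that decodes to the U+FFFD replacement character:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: a Java emulation of a fixed 10-byte COBOL field.
class PicField {
    static String picX10(String s) {
        byte[] src = s.getBytes(StandardCharsets.UTF_8);
        byte[] field = new byte[10];
        Arrays.fill(field, (byte) ' ');                               // space-padded, as COBOL does
        System.arraycopy(src, 0, field, 0, Math.min(src.length, 10)); // silent truncation
        // Truncating on a byte boundary can cut a multi-byte UTF-8 character
        // in half; the decoder replaces the malformed tail with U+FFFD.
        return new String(field, StandardCharsets.UTF_8);
    }
}
```

`picX10("Smith")` comes back space-padded to 10 bytes, while a name like "Müller-Lüdenscheid" gets chopped mid-character, which is precisely the kind of data mangling a naive 1:1 port inherits.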

[–] [email protected] 0 points 1 year ago (1 children)

Well, I'd rather the day be sooner than later.

Agreed, but we're not the ones making the decision. And the people who are have two options: move forward with a risky, expensive, and potentially career-ending effort with no benefit other than the system being a little more maintainable, or continue on with business as usual and earn massive sums of money they can use to buy a bigger yacht next year. It's a pretty obvious decision, and the consequences will probably fall on whoever takes over after they move on or retire, so who cares about the long-term consequences?

You run months and months of simulated transactions on the new code until you get an adequate amount of bugs fixed.

The stakes in financial services are so much higher than in typical software. If some API has 0.01% downtime or errors, nobody gives a shit. If your bank drops 1 out of every 1000 transactions, people lose their life savings. Even the most stringent of testing and staging environments don't guarantee the level of accuracy required without truly monstrous sums of money being thrown at them, which leads us back to my point above about risk vs. yachts.

There will come a time when these old COBOL machines will just straight die, and they can't be assed to keep making new hardware for them.

Contrary to popular belief, most mainframes are pretty new machines. IBM is basically afloat purely because giant banks and government institutions would rather just shell out a few hundred thousand every few years for a new, better Z-frame than going through the nightmare that is a migration.

If you're starting to think "wow, this system is doomed to collapse under its own weight and the people in charge are actively incentivized to not do anything about it," then you're paying attention and should probably start extending that thought process to everything else around you on a daily basis.

[–] [email protected] 1 points 1 year ago

When you say it like that... God bless America this whole system is doomed.

[–] [email protected] 0 points 1 year ago (1 children)

That's a lot of effort to go from one horrible programming language to another horrible programming language.

[–] [email protected] 1 points 1 year ago

Yes. Leave it to IBM to take a terrible idea and make it worse.

[–] [email protected] 0 points 1 year ago (4 children)

Converting ancient code to a more modern language seems like a great use for AI, in all honesty. Not a lot of COBOL devs out there but once it's Java the amount of coders available to fix/improve whatever ChatGPT spits out jumps exponentially!

[–] [email protected] 2 points 1 year ago (1 children)

The fact that you say that tells me that you don’t know very much about software engineering. This whole thing is a terrible idea, and has the potential to introduce tons of incredibly subtle bugs and security flaws. ML + LLM is not ready to be used for stuff like this at the moment in anything outside of an experimental context. Engineers are generally - and with very good reason - deeply wary of “too much magic” and this stuff falls squarely into that category.

[–] [email protected] 0 points 1 year ago (3 children)

All of that is mentioned in the article. Given how much it cost last time a company tried to convert from COBOL, don't be surprised when you see more businesses opt for this cheaper path. Even if it only converts half of the codebase, that's still a huge improvement.

Doing this manually is a tall order...

[–] [email protected] 1 points 1 year ago

And doing it manually is probably cheaper in the long run, especially considering that COBOL tends to power some very mission critical tasks, like financial systems.

The process should be:

  1. set up a way to have part of your codebase in your new language
  2. write tests for the code you're about to port
  3. port the code
  4. go to 2 until it's done

If you already have a robust test suite, step 2 becomes much easier.
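
Step 2 above can be sketched as a characterization ("golden master") test: pin the legacy behaviour down, then require the port to match it exactly, quirks included. Both functions below are hypothetical stand-ins; in a real migration, the "legacy" side would call the actual COBOL service rather than a Java re-statement of its rule.

```java
// Hypothetical sketch of test-first porting. The legacy rule truncates
// fractional cents via integer division; the port must reproduce that
// truncation exactly, not round "correctly".
class InterestPort {
    // Stand-in for the legacy COBOL calculation being characterized.
    static int legacyInterestCents(int principalCents, int rateBasisPoints) {
        return principalCents * rateBasisPoints / 10_000; // truncates, by design
    }

    // The new implementation: widened to long to avoid overflow, but it must
    // still agree with the legacy result on every characterized input.
    static int portedInterestCents(int principalCents, int rateBasisPoints) {
        return (int) ((long) principalCents * rateBasisPoints / 10_000);
    }
}
```

The payoff of doing step 2 before step 3 is that "the port disagrees with legacy" becomes a failing test you see immediately, instead of a subtle production bug you discover in an audit.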

We're doing this process on a simpler task of going from Flow (JavaScript with types) to TypeScript, but I did a larger transition from JavaScript to Go and Ruby to Python using the same strategy and I've seen lots of success stories with other changes (e.g. C to Rust).

If AI is involved, I would personally use it only for step 2 because writing tests is tedious and usually pretty easy to review. However, I would never use it for both step 2 and 3 because of the risk of introducing subtle bugs. LLMs don't understand the code, they merely spot patterns and that's absolutely not what you want.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (2 children)

Yeah, I read the article.

They’re MASSIVELY handwaving a lot of detail away. Moreover, they’re taking the “we’ll fix it in post” approach by suggesting “we can just run an armful of security analysis software on the code after the system spits something out”. While that’s a great sentiment, you (and everyone considering this approach) need to consider that complex systems are pretty much NEVER perfect. There WILL be misses. Add to this the fact that a ton of organizations that still use COBOL are banks, which are generally considered fairly critical to the day-to-day operation of our society, and you can see why I am incredibly skeptical of this whole line of thinking.

I’m sure the IBM engineers who made the thing are extremely good at what they do, but at the same time, I have a lot less faith in the organizations that will actually employ the system. In fact, I wouldn’t be terribly shocked to find that banks would assign an inappropriately junior engineer to the task - perhaps even an intern - because “it’s as simple as invoking a processing pipeline”. This puts a truly hilarious amount of trust into what’s effectively a black box.

Additionally, for a good engineer, learning any given programming language isn’t actually that hard. And if these transition efforts are done in what I would consider to be the right way, you’d also have a team of engineers who know both the input and output languages such that they can go over (at the very, very least) critical and logically complex areas of the code to ensure accuracy. But since this is all about saving money, I’d bet that step simply won’t be done.

[–] [email protected] 2 points 1 year ago (1 children)

For those who have never worked on legacy systems: anyone who suggests “we’ll fix it in post” is asking you to do something that just CANNOT happen.

The systems I code for, if something breaks, we’re going to court over it. Not “oh no, let’s patch it real quick”; it’s your ass that’s going to be cross-examined on why the eff your system just wrote thousands of legal contracts that cannot be upheld as valid.

Yeah, any article that suggests that fix-it-in-post shit, especially the one linked here, should be considered trash; its author has no remote idea how deep in shit you can be if you start getting wild hairs up your ass about changing out parts of a critical system.

[–] [email protected] 1 points 1 year ago

And that’s precisely the point I’m making. The systems we’re talking about here are almost exclusively banking systems. If you don’t think there will be some Fucking Huge Lawsuits over any and all serious bugs introduced by this (and there will be bugs introduced by this), you straight up do not understand what it’s like to develop software for mission-critical applications.

[–] [email protected] 1 points 1 year ago

Trusting IBM engineers, perhaps…sales/marketing? Oooh now I am skeptical.

[–] [email protected] 1 points 1 year ago (2 children)

Even if it only converts half of the codebase, that’s still a huge improvement.

The problem is it'll convert 100% of the code base, but (you hope) 50% of it will actually be correct. Which 50%? That's left as an exercise for the reader. There's also no human, no plan, no necessary logic to how it was converted, so it can be very difficult to understand code like that, and you can't ask the person who wrote it why stuff is a certain way.

Understanding large, complex codebases one didn't write is a difficult task even under pretty ideal conditions.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

First, odds are only half the code is used, and of that half, 20% has bugs that the system design obscures. It’s that 20% that tends to take the lion’s share of modernization effort.

It wasn’t a bug then, though it was there, but it is a bug now.

[–] [email protected] 0 points 1 year ago (1 children)

The problem is it’ll convert 100% of the code base

Please go read the article. They specifically say they aren't doing this.

[–] [email protected] 1 points 1 year ago (1 children)

I was speaking generally. In other words, the LLM will convert 100% of what you tell it to but only part of the result will be correct. That's the problem.

[–] [email protected] 0 points 1 year ago (1 children)

And in this case they're not doing that:

“IBM built the Code Assistant for IBM Z to be able to mix and match COBOL and Java services,” Puri said. “If the ‘understand’ and ‘refactor’ capabilities of the system recommend that a given sub-service of the application needs to stay in COBOL, it’ll be kept that way, and the other sub-services will be transformed into Java.”

So you might feed it your COBOL code and find it only converts 40%.

[–] [email protected] 1 points 1 year ago (1 children)

So you might feed it your COBOL code and find it only converts 40%.

I'm afraid you're completely missing my point.

The system gives you a recommendation: that has a 50% chance of being correct.

Let's say the system recommends converting 40% of the code base.

The system converts 40% of the code base. 50% of the converted result is correct.

50% is a random number picked out of thin air. The point is that what you end up with has a good chance of being incorrect and all the problems I mentioned originally apply.

[–] [email protected] 0 points 1 year ago (2 children)

One would hope that IBM's selling a product with a higher success rate than a coin flip, but the real question is long-term project cost. Given the example of a $700 million project, how much does the AI need to convert successfully before it pays for itself? If we end up with 20% of the original project successfully done by AI, that's massive savings.

The software's only going to get better, and in spite of how lucrative a COBOL career is, we don't exactly see a sharp increase in COBOL devs coming out of schools. We either start coming up with viable ways to move on from this language or we admit it's too essential to ever be forgotten and mandate every CompSci student learn it before graduating.
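
That break-even question reduces to trivial arithmetic. Apart from the $700M figure quoted above, every number in this sketch is a made-up assumption, not anything from IBM's announcement:

```java
// Back-of-the-envelope sketch: does the tool pay for itself?
class BreakEven {
    static long netSavingsUsd(long manualCostUsd, double aiFraction, long toolingCostUsd) {
        long avoided = (long) (manualCostUsd * aiFraction); // manual work the tool replaces
        return avoided - toolingCostUsd;                    // positive => tool pays for itself
    }
}
```

With a $700M manual-rewrite baseline, a hypothetical 20% successfully converted, and a hypothetical $10M spent on tooling and review, the net comes out well into nine figures, which is why the pitch lands even at modest conversion rates.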

[–] [email protected] 1 points 1 year ago

A random outcome isn't flipping a coin; it's rolling dice.

[–] [email protected] 1 points 1 year ago

One would hope that IBM’s selling a product that has a higher success rate than a coinflip

Again, my point really doesn't have anything to do with specific percentages. The point is that if some percentage of it is broken you aren't going to know exactly which parts. Sure, some problems might be obvious but some might be very rare edge cases.

If 99% of my program works, the remaining 1% might be enough to not only make the program useless but actively harmful.

Evaluating which parts are broken is also not easy. I mean, if there was already someone who understood the whole system intimately and was an expert then you wouldn't really need to rely on AI to port it.

Anyway, I'm not saying it's impossible, or necessarily not going to be worth it. Just that it is not an easy thing to make successful as an overall benefit. Also, issues like "some 1-in-100,000 edge case didn't get handled successfully" are very hard to quantify, since you don't really know about those problems in advance; they aren't apparent, and the effects can be subtle and occur much later.

Kind of like burning petroleum. Free energy, sounds great! Just as long as you don't count all side effects of extracting, refining and burning it.

[–] [email protected] 1 points 1 year ago

This is what I'm thinking. Even the few people I know IRL who learned COBOL in their starting days say it's a giant pain in the ass as a language. It's not like it's really going to cost all that much time compared to paying labor to rewrite it from scratch, even if they don't end up using the output. Sure, correcting bad code can take a lot of time to do manually. But important code being in COBOL is a ticking time bomb; they gotta do something.

[–] [email protected] 1 points 1 year ago (1 children)

I'm more alarmed at the conversation in this thread about migrating these COBOL apps to Java. Maybe I am the one who is out of touch, but what the actual fuck? Is it just because of the large Java hiring pool? If you are effectively starting from scratch, why in the ever-loving fuck would you pick Java?

[–] [email protected] 1 points 1 year ago

Java is the new COBOL; all the enterprises love it.

[–] [email protected] 1 points 1 year ago

Is ChatGPT magic to people? ChatGPT should never be used this way, because the potential for critical errors is astronomically high. IBM doesn't know what it's doing.
