Jump to content

Updating from USSEP 4.2.2 to 4.2.4b causes new RetroScripts() to run everytime you load a save


SenorCorruptedSave

Recommended Posts

I encountered a save corrupting bug. And after many hours of isolating it I finally figured out what is causing this, and I can easily reproduce it on a new vanilla save with only one mod installed: USSEP. Here are the steps to reproduce.

  1. Click New Game, wait for intro to finish and you have your hands freed. This pushes us well past stage 70 of MQ101.
  2. Install USSEP version 4.2.2. Wait for it to finish running the retroactive scripts, which can take quite a while due to them purposefully calling wait(). Checked my Papyrus.0.log to ensure they had all finished.
  3. Save, Exit, and update to USSEP 4.2.4b.
  4. Now, every single time you load this save, and a subsequent save, you will see the following two entries in your Papyrus.0.log:
[05/24/2021 - 05:14:09PM] VM is thawing...
[05/24/2021 - 05:14:09PM] USSEP 4.2.3 Retroactive Updates Complete
[05/24/2021 - 05:14:09PM] USSEP 4.2.4 Retroactive Updates Complete

Every time. As Arthmoor recently said these scripts should only run once, but they clearly aren't. I figured out why. In USSEP_VersionTrackingScript we see the following code.

	;Previous USLEEP script was handling tracking for both, bring over its version value if it's greater than 307, which was the last USLEEP retro script version.
	;Some of the stuff in the USSEP retro scripts should not be run twice.
	if( USLEEPVersionTracking.LastVersion > 307 )
		LastVersion = USLEEPVersionTracking.LastVersion
	endif

	if( LastVersion < 424 )
		if( LastVersion < 406 )
			Retro406.Process()
		Elseif( LastVersion < 407 )
			Retro407.Process()
		Elseif( LastVersion < 408 )
			Retro408.Process()
		Elseif( LastVersion < 413 )
			Retro413.Process()
		Elseif( LastVersion < 414 )
			Retro414.Process()
		Elseif( LastVersion < 415 )
			Retro415.Process()
		Elseif( LastVersion < 417 )
			Retro417.Process()
		Elseif( LastVersion < 420 )
			Retro420.Process()
		Elseif( LastVersion < 421 )
			Retro421.Process()
		Elseif( LastVersion < 423 )
			Retro423.Process()
		Elseif( LastVersion < 424 )
			Retro424.Process()
		EndIf
	EndIf

This script is flagged to run on OnPlayerLoadGame(), which fires everytime the player loads a save. Thus everytime it loads the game, the LastVersion gets reset to the stale USLEEPVersionTracking.LastVersion, which in the case of an update from 4.2.2 this var will equal 421. Which means as the code continues it will satisfy the conditions to trigger Retro423.Process().

Which brings us to the save corrupting symptom of this simple bug. It's quite delayed. You have to save and load a vanilla game 59 times to have your save corrupted. What is happening is that Retro423.Process() was only intended to be run once, and it adds 4 entries to a Level Item. Run it 59 times though, and it ends up adding 236 entries! That's on top of vanilla's 20 entries, so 20+236 = 256. However, Leveled Items can only have 255 entries. If you add more than this via scripts, the save is corrupted. How this manifests itself is that when the game engine attempts to create a new save, you get an instant CTD. The same exception code you get from circular leveled lists, which also cause a CTD on save.

Is this a bug I should report in the tracker? Or is it perhaps possible that USSEP does not support updating from one version to the next? I attempted to report this earlier, but Arthmoor marked it "Invalid". I probably did not do a good enough job of describing things, it seems like Arthmoor thought this had something to do with ReSaver. It doesn't. It can even be reproduced in a brand new save with only one mod, USSEP. Since this issue was already closed (And props on the fast triage! I wish our team was that quick.) I figured I should open a topic and discuss it first. If folks, especially if Arthmoor, agreed it's a bug that merits fixing, then I'd open a new issue with a better description. Plus my better understanding of the problem, after working a few more hours to further isolate and reproduce.

If you guys need my full Papyrus Logs or anything let me know, I purposefully made copies every step of the way. Thanks for a great mod!

Link to comment
Share on other sites

I can also corroborate a CTD-on-save tied to this issue, which in my play-through could only be corrected using ReSaver (without starting over entirely). To be very clear, in my playthrough I followed recommended procedures including not uninstalling mods mid-playthrough, only updating ones confirmed safe by the authors, and not using ReSaver at any point beforehand except to cure this particular CTD. I began my play-through with a save before leaving Helgen with USSEP 4.2.1 installed - and this appears to be the crux of the problem. The post here has a relevant .NET crash log, which lists the TESLevItem record for bandit arrows as relevant.

EDIT:

Took about ten minutes to verify the reproduction steps work. I load in with an up-to-date USSEP and a save-baked version from about a year ago, I save, and I exit. Repeat. I am seeing the above logs each time.

Papyrus logs here (shady I know):

https://easyupload.io/m/s1ok1z

Link to comment
Share on other sites

Right, so initially I had just reposted the closing comment for the bug explaining how the setup works. Then when I realized the issue I deleted my post (which someone had replied to so I removed it too).

So yeah, bottom line, the check for the old USLEEP version number needs an additional condition to keep THAT from resetting. So basically it'll become this now:

	;Previous USLEEP script was handling tracking for both, bring over its version value if it's greater than 307, which was the last USLEEP retro script version.
	;Some of the stuff in the USSEP retro scripts should not be run twice.
	if( USLEEPVersionTracking.LastVersion > 307 && PreviousVersionsCheck == false )
		LastVersion = USLEEPVersionTracking.LastVersion
		PreviousVersionsCheck = true
	endif

That will ensure it picks up the right previous version and only changes the variable once.

Now, as for why I brought up Resaver. Tools like that, Fallrim Tools, and any other tool meant to edit the contents of a save are dangerous. They WILL corrupt property data on scripts. Save files are NOT properly decoded and nobody has any idea what kind of interaction the decoded parts has with the undecoded parts. Any manipulation of this data is a guarantee you're going to have problems even if it doesn't look like it initially.

Link to comment
Share on other sites

That's an odd bit of coding.

For one thing, there are two tests for

2 hours ago, SenorCorruptedSave said:

( LastVersion < 424 )

Are those supposed to be ElseIf?

Running version USSEP 400, upgrading to 424. Doesn't this require running every Process() in the list? Why is it better to have 406 Process() call 407 Process() call 408 Process() et cetera?

Link to comment
Share on other sites

They're called the way they are to force Papyrus into calling them in sequence. Dumping them all into the main process checker would allow it to process them out of order which is something we don't want to happen.

Link to comment
Share on other sites

I've updated our current USSEP beta to include a fix to address this if you guys want to give it a shot on any affected saves.

Link to comment
Share on other sites

9 hours ago, Arthmoor said:

Dumping them all into the main process checker would allow it to process them out of order which is something we don't want to happen.

If Papyrus allows out of order parallel processing within a single function, then a lot of scripts aren't going to work. Could you cite an example?

Calling the same function 3 different ways isn't a good programming practice. Invoking the serial sequence of functions as subfunctions just builds up the stack, another poor programming practice. All those redundant stage checks aren't good practice either.

If it is required to make Papyrus work, then it needs to be better documented.

Link to comment
Share on other sites

I appreciate ya giving it a second shake. The fix looks great.

I would note it introduces a tiny incompatibility with other mods that also added to this item via scripts. Notably the Exotic Arrows CC does this. The incompatibility is tiny, in that most folks will probably never notice it.

I wonder though, would it make sense to check if you need to run the Revert()? For brand new installs that never updated, their leveled list is pristine (in theory), and the Revert() can only remove entries from other mods. I'd speculate that by having a USLEEP LastVersion > 400, you can reliably tell whether or not the person has updated or it's a new install.

The Revert() actually fixes another bug, one that was so small I didn't even want to report it. In previous versions like 4.2.2, the Bandit Arrows were added inside of UDBP_Retroactive212Script.
 

Scriptname UDBP_Retroactive212Script extends Quest  

Quest Property MQ101 Auto
UDBP_VersionTrackingScript Property UDBPTracking Auto

Quest Property DLC2MQ06 Auto

Quest Property DLC2SkaalVillageFreeform2 Auto
ReferenceAlias Property Edla Auto

Quest Property DLC2RR03Intro Auto

LeveledItem Property LItemBanditWeaponArrows Auto
Ammo Property DLC2NordicArrow Auto

Function Process()
    ;Bug #19338 - Nordic arrows never added to bandit arrows list. DLC2Init does 4 of these for bows, so we'll do 4 for arrows too.
    LItemBanditWeaponArrows.AddForm( DLC2NordicArrow, 24, 1 )
    LItemBanditWeaponArrows.AddForm( DLC2NordicArrow, 25, 1 )
    LItemBanditWeaponArrows.AddForm( DLC2NordicArrow, 26, 1 )
    LItemBanditWeaponArrows.AddForm( DLC2NordicArrow, 27, 1 )

This Logic was moved to USSEPRetroactive423Script. Thus for anyone updating, they would have already had the bandit arrows added by the old UDBP scripts, and would have them added again by the new 423 one. Thus, 423 always creates at least one duplicate during updates. But by calling Revert() afterwards, you fixed it.

Link to comment
Share on other sites

One other thought, if you do wrap the Revert() in a conditional that only fires for updates, would it make sense to do a popup notifying people of the potential incompatibility the Revert() causes? I don't know if you guys ever do that, nor how common it is to introduce small incompatibilities out of necessity. But, if this is rare a popup could be welcome.

I am tempted to write a custom SKSE script, one that outputs which non-duplicate entries were deleted and allows one to reinstate them. This would be for my own use, but I'd happily share it in the forums or somewhere, for other SKSE users. I completely agree about ReSaver being dangerous, which is why I wanted to get to the bottom of this rather than using it. This script should be a decent alternative, assuming you have a save from just before the CTDs started.

Link to comment
Share on other sites

I can add something to the changelog to make note of the reversion but I'd rather not have the thing throwing popups at people for it.

Yes, I can add something to stop it from running on a fresh save. It's usually the case that you can't get to the save menu before stage 70 of MQ101 so I'll just recycle the same check for this.

Link to comment
Share on other sites

16 hours ago, Arthmoor said:

Right, so initially I had just reposted the closing comment for the bug explaining how the setup works. Then when I realized the issue I deleted my post (which someone had replied to so I removed it too).

So yeah, bottom line, the check for the old USLEEP version number needs an additional condition to keep THAT from resetting. So basically it'll become this now:


	;Previous USLEEP script was handling tracking for both, bring over its version value if it's greater than 307, which was the last USLEEP retro script version.
	;Some of the stuff in the USSEP retro scripts should not be run twice.
	if( USLEEPVersionTracking.LastVersion > 307 && PreviousVersionsCheck == false )
		LastVersion = USLEEPVersionTracking.LastVersion
		PreviousVersionsCheck = true
	endif

That will ensure it picks up the right previous version and only changes the variable once.

Now, as for why I brought up Resaver. Tools like that, Fallrim Tools, and any other tool meant to edit the contents of a save are dangerous. They WILL corrupt property data on scripts. Save files are NOT properly decoded and nobody has any idea what kind of interaction the decoded parts has with the undecoded parts. Any manipulation of this data is a guarantee you're going to have problems even if it doesn't look like it initially.

Just curious and I could be wrong about this.

Wasn't that USLEEP version check meant for people who used the old separate USKP patches e.g USKP, UDGP, UHFP, UDBP and wanted to upgrade to USLEEP?

If so then I no reason in keeping that piece of code in USSEP.

Link to comment
Share on other sites

5 hours ago, Arthmoor said:

I can add something to the changelog to make note of the reversion but I'd rather not have the thing throwing popups at people for it.

Yes, I can add something to stop it from running on a fresh save. It's usually the case that you can't get to the save menu before stage 70 of MQ101 so I'll just recycle the same check for this.

Yeah that makes sense. Cheers!

4 hours ago, Leonardo said:

Just curious and I could be wrong about this.

Wasn't that USLEEP version check meant for people who used the old separate USKP patches e.g USKP, UDGP, UHFP, UDBP and wanted to upgrade to USLEEP?

If so then I no reason in keeping that piece of code in USSEP.

Looks like it is still used to me. Internally, you can see that the code, quest names, etc. All have a history dating back to the separate patches. That USLEEP version actually was the master version until at least version 4.2.2 of USSEP. Since then the separate USSEP version tracker was added. However, there are still version tracker for each of the 4 originals, plus LE, and now the new SSE one. SSE checks in with all of them before doing its thing, as they all still track their internal versions.

I'd characterize the design as similar to a handful of Lego blocks. It looks like it was relatively easy to slap an LE one overtop the 4 originals, then an SSE one over top that. The originals done seem like they need to be touched ever again. But they do get included, and do the same bugfixes they always did. So during an update the new SSE block needs to check and make sure they and LE finished, before adding itself on top.

Link to comment
Share on other sites

OK, so the updated beta now has a check included so that the revert call won't be done on a fresh save (or any 1st time run of USSEP) but will still call that revert for someone updating from an older version.

You pretty much have it as far as how the scripts were built up over time and stack on top of each other. Part of the reason this leveled list issue ended up a problem is because USLEEP and USSEP work overlapped for awhile and there wasn't a clean enough break made back in the day.

I've also gone back over every single one of these scripts now to make sure no more nasty leveled list issues are lurking - they aren't. Everything else I thought I was remembering were formlists and those have protective HasForm() checks on them already. Too bad the LeveledItem base script didn't support that. It would have avoided a lot of messes.

You'll also notice now that each of the retro scripts will announce themselves in the Papyrus log even on a brand new game. They didn't used to, and it would probably have been helpful if they had. :P

Link to comment
Share on other sites

10 hours ago, DayDreamer said:

If Papyrus allows out of order parallel processing within a single function, then a lot of scripts aren't going to work. Could you cite an example?

Calling the same function 3 different ways isn't a good programming practice. Invoking the serial sequence of functions as subfunctions just builds up the stack, another poor programming practice. All those redundant stage checks aren't good practice either.

If it is required to make Papyrus work, then it needs to be better documented.

Missed this before, but you're looking at the example to cite already. The original setup for these scripts was to call them all in sequence from the version tracking script. We found that simply putting them all there to be called one by one allowed the VM to call the functions out of order. The logs would bear that out by seeing the debug traces being in the wrong sequences. There was an old post on the old Jive forum at bethesda.net (which is long gone now) where SMKViper explained that it's part of how it enabled multithreading and he confirmed that how we initially planned this system out wouldn't have worked and that the refactoring into what you see now was a good call back in the day.

So now when one retro script finishes, it calls the next one in the sequence, and so on until they're all done. The logs bear out that they will always work this way now. So there's absolutely no incentive to change it.

Link to comment
Share on other sites

6 hours ago, Arthmoor said:

Missed this before, but you're looking at the example to cite already. The original setup for these scripts was to call them all in sequence from the version tracking script. We found that simply putting them all there to be called one by one allowed the VM to call the functions out of order. The logs would bear that out by seeing the debug traces being in the wrong sequences. There was an old post on the old Jive forum at bethesda.net (which is long gone now) where SMKViper explained that it's part of how it enabled multithreading and he confirmed that how we initially planned this system out wouldn't have worked and that the refactoring into what you see now was a good call back in the day.

So now when one retro script finishes, it calls the next one in the sequence, and so on until they're all done. The logs bear out that they will always work this way now. So there's absolutely no incentive to change it.

That's interesting. I'd speculate that doesn't apply functions with returns values. How would you even be able to program if it did? But, if the function doesn't have a value that gets utilized locally, I could see how you could generally split it off and few would notice.

Link to comment
Share on other sites

On 5/25/2021 at 7:54 PM, Arthmoor said:

here was an old post on the old Jive forum at bethesda.net (which is long gone now) where SMKViper explained that it's part of how it enabled multithreading and he confirmed that how we initially planned this system out wouldn't have worked and that the refactoring into what you see now was a good call back in the day.

Thanks, this is important to know. I'm wondering how many other places are affected by this "multithreading"....

It does make it very hard to follow, and violates the usual teaching on structured programming and object methods.

No wonder that we had such problems getting the weapon rack scripts to work. Running two parallel scripts was bad enough. If they could in turn be split into threads without any thread locking or coordination method, what a mess!

Thanks also for printing out each retroscript running. In December, I'd reported the 4.2.4 variable changes in the log; the kind of thing that I've always checked. Never knew to check that a retroscript was running multiple times.

 

On 5/26/2021 at 2:03 AM, SenorCorruptedSave said:

I'd speculate that doesn't apply functions with returns values.

Also, idempotency. Sometimes it seems to me that Bethesda didn't hire anybody with compiler or kernel experience.

Back in the day (mid-1970s), we all took a semester systems programming class (multiprocessing scheduler), and a full year compiler class. As time went by, those became graduate level specializations. Nowadays, my colleagues complain kids just want to learn Rudy on Rails to get a job.

Anyway, want to thank whomever SenorCorruptedSave might be (you really don't need to hide your name around here). Excellent detective work.

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...