Mailing List webobjects-dev@wocommunity.org Message #107
From: OCsite <webobjects-dev@wocommunity.org>
Subject: Re: [WO-DEV] ERXObjectStoreCoordinatorSynchronizer woes
Date: Mon, 22 Mar 2021 03:17:33 +0100
To: WebObjects & WOnder Development <webobjects-dev@wocommunity.org>
Aaron,

I can see why Wonder replaces some Apple classes. What I do not see is why, whenever such a replacement is important and its absence may lead to hard-to-find bugs, it does not check and at the very least log an error, something like “We got an original NSArray instead of our fixed one. Therefore, the Very Nice Foo Bar functionality was switched off, lest we get surprising crashes and exceptions. To fix the problem check your classpath; very probably it contains JavaFoundation before ERExtensions: switch them!” :)
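
A minimal sketch of such a check, using only plain JDK calls. The class `ClassOriginCheck`, its method names, and the assumption that the Wonder JAR name contains "ERExtensions" are all hypothetical; nothing here is part of Wonder or WebObjects.

```java
import java.net.URL;
import java.security.CodeSource;

/** Hypothetical startup check; not part of Wonder or WebObjects. */
final class ClassOriginCheck {

    /** Returns the location a class was loaded from, or a placeholder when the JVM exposes none. */
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        return (location == null) ? "(bootstrap/unknown)" : location.toString();
    }

    /** True when the origin of the class contains the expected name fragment. */
    static boolean loadedFrom(Class<?> type, String expectedFragment) {
        return originOf(type).contains(expectedFragment);
    }

    /** Builds a warning when the named class came from an unexpected place; returns null when all is well. */
    static String warningFor(String className, String expectedFragment) {
        try {
            // Load without initializing, so the check itself has no side effects.
            Class<?> type = Class.forName(className, false, ClassOriginCheck.class.getClassLoader());
            if (loadedFrom(type, expectedFragment)) {
                return null;
            }
            return className + " was loaded from " + originOf(type)
                    + ", expected a location containing '" + expectedFragment
                    + "'. Check the classpath order.";
        } catch (ClassNotFoundException e) {
            return className + " is not on the classpath at all.";
        }
    }

    public static void main(String[] args) {
        // Assumption: the replacement NSArray ships in a JAR whose name contains "ERExtensions".
        String warning = warningFor("com.webobjects.foundation.NSArray", "ERExtensions");
        System.err.println(warning == null ? "Classpath order looks fine." : warning);
    }
}
```

Run once at application start, a check like this would turn a silent classpath mix-up into a single log line.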

Aside from that, I've found some notes which seem to suggest our classpath order was intentional: some 7 years ago there were problems with RandomAccessSubList, and putting JavaFoundation before ERX ensured there was no RandomAccessSubList, just a reliable (albeit perhaps less efficient) NSMutableArray. Alas, the notes do not say what particular problems RandomAccessSubList caused, so I can't simply test whether they have been fixed in the meantime :(

Thanks and all the best,
OC

On 22 Mar 2021, at 2:37, Aaron Rosenzweig <webobjects-dev@wocommunity.org> wrote:

Hi OC,

The order of the JARs may be causing issues here; either way, the order definitely matters.

Take “NSArray” for example. That’s a NeXT/Apple class, right? Yes & No. WOnder provides its own version of it. Wonder’s version has the same name and is in the same package: com.webobjects.foundation

Why did we, as a community, do that? In the early days it was so that NSArray could use generics… then in WO 5.4.3 they fixed that on the Apple side so… I’m not sure, offhand, why we still have our own version of NSArray but… we do. And there are other classes like this. This “works” because Java takes the first matching package/class it finds on the classpath and uses it all the time. First JAR file wins, so the Wonder stuff must come before the Apple stuff.
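
To see which copy wins in a given deployment, one can ask the class loader for every copy of a class file it can find. The sketch below uses only the JDK; `ClasspathCopies` and its methods are hypothetical names. On a plain classpath loader the results normally follow classpath order, so the first entry is the one that gets used, but a custom launcher or class loader may behave differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic; lists every copy of a class visible to the system class loader. */
final class ClasspathCopies {

    /** Converts a class name such as a.b.C into the resource path a/b/C.class. */
    static String resourceNameOf(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns the URLs of all copies of the class file, in the order the loader reports them. */
    static List<URL> copiesOf(String className) {
        try {
            ClassLoader loader = ClassLoader.getSystemClassLoader();
            return Collections.list(loader.getResources(resourceNameOf(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String className = (args.length > 0) ? args[0] : "com.webobjects.foundation.NSArray";
        List<URL> copies = copiesOf(className);
        if (copies.isEmpty()) {
            System.out.println(className + ": not found on the classpath");
        }
        for (int i = 0; i < copies.size(); i++) {
            // More than one line of output means two JARs provide the same class.
            System.out.println((i + 1) + ". " + copies.get(i));
        }
    }
}
```

If the output lists the JavaFoundation JAR before the ERExtensions one, the Apple class is shadowing the Wonder replacement.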


On Mar 21, 2021, at 2:54 PM, OCsite <webobjects-dev@wocommunity.org> wrote:

Aaron,

thanks! The app is definitely fully initialised (the import is run from a web page; besides, it often happens not the 1st time, but later, on the 6th import or so). In a sense, I do not use the stuff, that is, not actively; I just create a new OSC to get a separate EO stack with its own database channel. All the other stuff sort of happens as a result :)

Now though, very preliminarily, it seems the problem might depend on the classpath. What the! Namely, it looks like

- it does happen if JavaFoundation and JavaWebObjects precede the ER stuff;
- so far, it never happened if ERExtensions and ERJars precede JavaFoundation and JavaWebObjects (that, of course, is inconclusive, given the randomness of the issue).

Looks like WOnder overrides some WO functionality, and unless it is first on the classpath, that might lead to problems. Very weird, especially since it does not check whether the tricks it relies on really happened or not :-O

I wonder if this is really the culprit...

Thanks and all the best,
OC


On 21 Mar 2021, at 17:05, Aaron Rosenzweig <webobjects-dev@wocommunity.org> wrote:

Hi OC, 

Check to be sure your import task is only started after the app has finished loading, not during the launch of the app. If you call it too early in the startup phase your ModelGroups, etc., may not be set up yet. That’s what it sort of looks like from your stack trace, because you have a null pointer inside of EOModelGroup, which is a NeXT/Apple object, not even a WOnder one.

If you double check and are sure you don’t kick off a concurrent thread before the app has finished loading… then I’m not sure. You may have to revisit your use of the ERX messaging coordinators. I’ve never used them so I don’t have experience to share. From where I stand they sound “cool” but I don’t get the use case. I get that people want “fresh” data, and if every edit is messaged to all the other ObjectStoreCoordinators then everybody is fresh all the time! Cool! But at what cost? Does every instance need to fault in objects that people may never see? If someone is editing the same data, and they get an update from some other thread, what then? Who wins? Chatter is expensive on CPU / network too. Seems to me that if people want “fresh” then the best thing is not to sync but to get fresh data on the page that you are at, by setting the timestamp lag to something small like 2 seconds. For a statistics page maybe avoid EOF altogether and use a direct SQL fetch.

On Mar 21, 2021, at 12:01 AM, OCsite <webobjects-dev@wocommunity.org> wrote:

Hi there,

occasionally (not too often), we run a background import task, which uses its own EO stack: at launch, it creates a new EOObjectStoreCoordinator (and for it, it creates an ERXEC and uses it to import data). When done and saved, the coordinator is disposed and released. The rationale is that the imported data might be big, and we don't want the normal worker processing to have to wait until the import saves its results into the database.

For a long long time it worked reliably and without a glitch.

Lately, it often (though by no means every time!) happens that

(i) a save in the background task reports the following exception:

===
04:38:38.600 ERROR java.lang.NullPointerException       //log:er.extensions.eof.ERXObjectStoreCoordinatorSynchronizer [ERXOSCProcessChanges]
NullPointerException
  at com.webobjects.eoaccess.EOModelGroup.modelGroupForObjectStoreCoordinator(EOModelGroup.java:795)
  at er.extensions.eof.ERXEOAccessUtilities.databaseContextForEntityNamed(ERXEOAccessUtilities.java:1086)
  at er.extensions.eof.ERXObjectStoreCoordinatorSynchronizer$ProcessChangesQueue._process(ERXObjectStoreCoordinatorSynchronizer.java:509)
  at er.extensions.eof.ERXObjectStoreCoordinatorSynchronizer$ProcessChangesQueue.process(ERXObjectStoreCoordinatorSynchronizer.java:540)
  at er.extensions.eof.ERXObjectStoreCoordinatorSynchronizer$ProcessChangesQueue.run(ERXObjectStoreCoordinatorSynchronizer.java:617)
  ... skipped 1 stack elements
===

(ii) after that, usually no more exceptions are reported, but the ERXObjectStoreCoordinatorSynchronizer does not seem to work properly anymore, and it often happens that the changes made in the background task are not visible in the main OSC for a while.

From the user's perspective it usually means that the import is finished, but the imported data is not visible for a long long time (it does not seem to be just a fetchTimestampLag, for newly logged-in users with their new sessions and new ECs still don't see the imported data for a while). Frankly, I can't see what the H. might be the culprit :/

(iii) another problem, which also seems to be caused (perhaps indirectly) by the above exception, is that the application cannot be quit normally from JavaMonitor; an attempt reports

===
04:33:43.441 ERROR Exception caught: null
... ...
IllegalStateException: Attempted to stop the ProcessChangesQueue when it wasn't already running
  at er.extensions.eof.ERXObjectStoreCoordinatorSynchronizer$ProcessChangesQueue.stop(ERXObjectStoreCoordinatorSynchronizer.java:637)
  at er.extensions.eof.ERXObjectStoreCoordinatorSynchronizer.stopRemoteSynchronizer(ERXObjectStoreCoordinatorSynchronizer.java:132)
     ... skipped 8 stack elements
  at er.extensions.appserver.ERXApplication.terminate(ERXApplication.java:2861)
... ...
===

Any idea what might be the culprit and how to fix it?

Thanks and all the best,
OC




