Oh, FrontBase, did you ask those guys about it?
Now, of course, these EOF logs are from different editing contexts and perhaps different app/memory space?
At times in the past, although I do not recall the “science” behind this, we thought it best to separate application space between what users interact, what APIs or machines and finally, any automated or janitorial type processes on different application spaces.
I suppose that has benefits of capacity and load balance but in the end, you only get one open channel to the database and this may affect the state of things at very high concurrency.
FrontBase has certainly established some extremely high loads in the past anyway.
Jesse,
Did you dump the database’s own logic tree output?
nope, I did not. There are two reasons for that: (a) I don't know how to :) (the DB is FrontBase, incidentally), (b) even if I knew, I could not, for it happens on a production site where my access is seriously limited. So far, all attempts to repeat the problem on a test side failed :/
It stinks by a problem at the EOF side anyway. Of course I can't be sure, but it looks like the DB provides properly, but EOF for some weird reason sometimes, when the threads happen to clash in a wrong way, cancels reading in the result set at its side.
To make sure — is there a debug setting which would log the SQL results received at the application side? All I know is NSLog.allowDebugLoggingForGroups(NSLog.DebugGroupSQLGeneration|NSLog.DebugGroupDatabaseAccess|NSLog.DebugGroupEnterpriseObjects)
plus log4j.logger.er.transaction.adaptor.EOAdaptorDebugEnabled=DEBUG; these though give me the SQL sent all right, but I do not see the results received.
Thanks a lot, OC
Did you dump the database’s own logic tree output? Sometimes you can see the points where it decides on using an index or whatever and perhaps a failure is either visible or there’s a threshold limitation below EOF
I doubt EOF has the awareness to do other than simply backtrace like this, so I just wonder if there are lower level reports you can review
P.P.S. It actually does look like the weird cancelled fetch was somehow affected by the background task; upon further investigation of the case=== 15:05:48.528 DEBUG === Begin Internal Transaction //log:NSLog [WorkerThread5] 15:05:48.528 DEBUG evaluateExpression: <com.webobjects.jdbcadaptor.FrontbasePlugIn$FrontbaseExpression: "SELECT ... t0."C_UID", ... FROM "T_RECORD" t0 WHERE t0."C_IMPORT_ID" = 1003547" withBindings: > //log:NSLog [WorkerThread5] 15:05:49.937 DEBUG fetch canceled //log:NSLog [WorkerThread5] 15:05:49.937 DEBUG 164 row(s) processed //log:NSLog [WorkerThread5] 15:05:49.941 DEBUG === Commit Internal Transaction //log:NSLog [WorkerThread5] 15:05:49.941 INFO Database Exception occured: java.lang.IllegalArgumentException: Cannot determine primary key for entity DBRecord from row: {... uid = <com.webobjects.foundation.NSKeyValueCoding$Null>; ... } //log:er.transaction.adaptor.Exceptions [WorkerThread5] ===
I've found that concurrently a background fetch did run, and essentially at the same time — just couple of hundreds of second later — was cancelled, too:
=== 15:05:47.355 DEBUG === Begin Internal Transaction //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB] 15:05:47.355 DEBUG evaluateExpression: <com.webobjects.jdbcadaptor.FrontbasePlugIn$FrontbaseExpression: "SELECT count(*) FROM "T_RECORD" t0, "T_IMPORT" T1, "T_IMPORT" T3, "T_RECORD" T2 WHERE (t0."C_OWNER__ID" is NULL AND T3."C_DATA_BLOCK_ID" = 1000387) AND t0."C_IMPORT_ID" = T1."C_UID" AND T2."C_IMPORT_ID" = T3."C_UID" AND T1."C_OWNER_RECORD_ID" = T2."C_UID"" withBindings: > //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB] 15:05:49.973 DEBUG fetch canceled //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB] 15:05:49.975 DEBUG 1 row(s) processed //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB] 15:05:49.983 DEBUG === Commit Internal Transaction //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB] ===
Note it runs over a different OSC, but still — might this be the culprit? Would EOF somehow cancel a fetch if two of them happen in the same moment, albeit both of them happen in different ECs over a different OSC?
I do not lock here, for there is absolutely no danger more threads would use the same EC concurrently (though more different background threads could use concurrently ECs over same OSC). I thought it is all right in this case. Is it not? Should I try to lock those ECs or even the OSC?
Thanks, OC
Any idea what might go wrong and how to fix it? Thanks! OC
Hi there,
just bumped into another weird EOF case. A pretty plain fetch caused a “Cannot determine primary key for entity” exception. The row contains a number of columns whose values makes sense, some null, some non-null, with one exception — the primary key, modelled as an attribute uid, is indeed a null, thus the exception makes a perfect sense.
How can this happen?
=== IllegalArgumentException: Cannot determine primary key for entity DBRecord from row: {... uid = <com.webobjects.foundation.NSKeyValueCoding$Null>; ... } at com.webobjects.eoaccess.EODatabaseChannel._fetchObject(EODatabaseChannel.java:348) ... skipped 2 stack elements at com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(EOObjectStoreCoordinator.java:488) at com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(EOEditingContext.java:4069) at er.extensions.eof.ERXEC.objectsWithFetchSpecification(ERXEC.java:1215) ... skipped 1 stack elements at com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsForSourceGlobalID(EOObjectStoreCoordinator.java:634) at com.webobjects.eocontrol.EOEditingContext.objectsForSourceGlobalID(EOEditingContext.java:3923) at er.extensions.eof.ERXEC.objectsForSourceGlobalID(ERXEC.java:1178) ... skipped 1 stack elements at com.webobjects.eoaccess.EOAccessArrayFaultHandler.completeInitializationOfObject(EOAccessArrayFaultHandler.java:77) at com.webobjects.eocontrol._EOCheapCopyMutableArray.willRead(_EOCheapCopyMutableArray.java:45) at com.webobjects.eocontrol._EOCheapCopyMutableArray.count(_EOCheapCopyMutableArray.java:103) at com.webobjects.foundation.NSArray.isEmpty(NSArray.java:1888) ... ===
Just in case it happens to be important (I believe it is not), the problem happens at row
... =eolist.representedObject.records().isEmpty()?...:...
where records just returns storedValueForKey('records'), self-evidently a fault, which fires to fetch the rows.
Searching the Web, all I've found is this (linked from here), which does not really help :) Truth is, some background threads do run at the moment; they are comparatively plain though and I can't see why they should cause the problem for the R/R thread. All they do is to
1. get their own OSC from the pool, making sure they never get the same OSC normal sessions have 2. create a new ERXEC in this OSC 3. get a local instance of an object in the EC
=== this is the code of the background thread; a number of those runs: def store for (def pool=ERXObjectStoreCoordinatorPool._pool();;) { store=pool.nextObjectStore if (store!=_sessionosc) break // there's one OSC for all sessions, stored in _sessionosc } return eo.localInstanceIn(ERXEC.newEditingContext(store)).numberOfMasterRowsWithoutOwner() ===
and the method simply fetches:
=== int numberOfMasterRowsWithoutOwner { def mymasterrow=EOQualifier.qualifierWithQualifierFormat("importObject.dataBlock = %@ AND recordOwner = NULL",[this] as NSA) return ERXEOControlUtilities.objectCountWithQualifier(this.editingContext, 'DBRecord', mymasterrow) } ===
Most time it works properly. Occasionally — rather rarely — the problem above happens. Can you see what am I doing wrong?
Thanks a lot, OC
|