From: "OCsite"
To: WebObjects & WOnder Development
Subject: Re: [WO-DEV] Cannot determine primary key for entity
Date: Fri, 11 Jun 2021 18:19:38 +0200

Jesse,

> On 11. 6. 2021, at 17:47, Jesse Tayler <webobjects-dev@wocommunity.org> wrote:
> Oh, FrontBase, did you ask those guys about it?

You mean the FB support? Nope; so far I rather suppose the problem is on the EOF side (caused by my code somehow), until proven otherwise.

> Now, of course, these EOF logs are from different editing contexts and perhaps different app/memory space?

Different ECs, different OSCs, one application (one instance). Not sure what a “memory space” is?

> At times in the past, although I do not recall the “science” behind this, we thought it best to separate application spaces: one for what users interact with, one for APIs or machines, and finally one for any automated or janitorial-type processes.

Indeed; nevertheless, in this particular case it would not do. The background tasks compute some things which need to be shown in the GUI; they might take a long time though, and therefore they are computed in the background (and, when done, displayed through JavaScript; Ajax, more or less).

> I suppose that has benefits of capacity and load balance but in the end, you only get one open channel to the database and this may affect the state of things at very high concurrency.

So far I have lived under the impression that each OSC has its own database channel? Which is, actually, the very reason we use more of them.

Thanks and all the best,
OC


> FrontBase has certainly established some extremely high loads in the past anyway.



>> On Jun 11, 2021, at 11:32 AM, OCsite <webobjects-dev@wocommunity.org> wrote:
>>
>> Jesse,
>>
>>> Did you dump the database’s own logic tree output?

>> nope, I did not. There are two reasons for that: (a) I don't know how to :) (the DB is FrontBase, incidentally), (b) even if I knew, I could not, for it happens on a production site where my access is seriously limited. So far, all attempts to repeat the problem on a test site failed :/
>>
>> It smells like a problem on the EOF side anyway. Of course I can't be sure, but it looks like the DB delivers the rows properly, but EOF for some weird reason sometimes, when the threads happen to clash in a wrong way, cancels reading the result set on its side.
>>
>> To make sure: is there a debug setting which would log the SQL results received at the application side? All I know is
>>
>>     NSLog.allowDebugLoggingForGroups(NSLog.DebugGroupSQLGeneration|NSLog.DebugGroupDatabaseAccess|NSLog.DebugGroupEnterpriseObjects)
>>
>> plus log4j.logger.er.transaction.adaptor.EOAdaptorDebugEnabled=DEBUG; these though give me the SQL sent all right, but I do not see the results received.
>>
>> Thanks a lot,
>> OC

>>> On 11. 6. 2021, at 17:18, Jesse Tayler <webobjects-dev@wocommunity.org> wrote:
>>> Did you dump the database’s own logic tree output? Sometimes you can see the points where it decides on using an index or whatever and perhaps a failure is either visible or there’s a threshold limitation below EOF
>>>
>>> I doubt EOF has the awareness to do other than simply backtrace like this, so I just wonder if there are lower level reports you can review

>>>> On Jun 11, 2021, at 11:08 AM, OCsite <webobjects-dev@wocommunity.org> wrote:
>>>>
>>>> P.P.S. It actually does look like the weird cancelled fetch was somehow affected by the background task; upon further investigation of the case

>>>>> On 11. 6. 2021, at 15:20, OCsite <webobjects-dev@wocommunity.org> wrote:
>>>>> ===
>>>>> 15:05:48.528 DEBUG  === Begin Internal Transaction  //log:NSLog [WorkerThread5]
>>>>> 15:05:48.528 DEBUG  evaluateExpression: <com.webobjects.jdbcadaptor.FrontbasePlugIn$FrontbaseExpression: "SELECT ... t0."C_UID", ... FROM "T_RECORD" t0 WHERE t0."C_IMPORT_ID" = 1003547" withBindings: >  //log:NSLog [WorkerThread5]
>>>>> 15:05:49.937 DEBUG fetch canceled  //log:NSLog [WorkerThread5]
>>>>> 15:05:49.937 DEBUG 164 row(s) processed  //log:NSLog [WorkerThread5]
>>>>> 15:05:49.941 DEBUG  === Commit Internal Transaction  //log:NSLog [WorkerThread5]
>>>>> 15:05:49.941 INFO  Database Exception occured: java.lang.IllegalArgumentException: Cannot determine primary key for entity DBRecord from row: {... uid = <com.webobjects.foundation.NSKeyValueCoding$Null>; ... }  //log:er.transaction.adaptor.Exceptions [WorkerThread5]
>>>>> ===

>>>> I've found that a background fetch ran concurrently and, essentially at the same time (just a couple of hundredths of a second later), was cancelled, too:
>>>>
>>>> ===
>>>> 15:05:47.355 DEBUG  === Begin Internal Transaction  //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
>>>> 15:05:47.355 DEBUG  evaluateExpression: <com.webobjects.jdbcadaptor.FrontbasePlugIn$FrontbaseExpression: "SELECT count(*) FROM "T_RECORD" t0, "T_IMPORT" T1, "T_IMPORT" T3, "T_RECORD" T2 WHERE (t0."C_OWNER__ID" is NULL AND T3."C_DATA_BLOCK_ID" = 1000387) AND t0."C_IMPORT_ID" = T1."C_UID" AND T2."C_IMPORT_ID" = T3."C_UID" AND T1."C_OWNER_RECORD_ID" = T2."C_UID"" withBindings: >  //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
>>>> 15:05:49.973 DEBUG fetch canceled  //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
>>>> 15:05:49.975 DEBUG 1 row(s) processed  //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
>>>> 15:05:49.983 DEBUG  === Commit Internal Transaction  //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
>>>> ===
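
To put numbers on "essentially at the same time", the timestamps from the two log excerpts can be compared directly. The following is a minimal sketch using only java.time; the class and method names are made up for illustration, and the timestamp strings are copied from the logs quoted in this thread.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper: compares the log timestamps quoted in this thread.
final class FetchTimeline {
    // Milliseconds elapsed between two HH:mm:ss.SSS timestamps of the same day.
    static long millisBetween(String from, String to) {
        return Duration.between(LocalTime.parse(from), LocalTime.parse(to)).toMillis();
    }

    // True when the closed intervals [startA, endA] and [startB, endB] share at least one instant.
    static boolean overlaps(String startA, String endA, String startB, String endB) {
        LocalTime sa = LocalTime.parse(startA), ea = LocalTime.parse(endA);
        LocalTime sb = LocalTime.parse(startB), eb = LocalTime.parse(endB);
        return !ea.isBefore(sb) && !eb.isBefore(sa);
    }

    public static void main(String[] args) {
        // Gap between the two "fetch canceled" entries (WorkerThread5, then the background thread).
        System.out.println(millisBetween("15:05:49.937", "15:05:49.973")); // prints 36
        // Duration of each internal transaction, Begin to Commit.
        System.out.println(millisBetween("15:05:48.528", "15:05:49.941")); // prints 1413
        System.out.println(millisBetween("15:05:47.355", "15:05:49.983")); // prints 2628
        // The background transaction spans the whole worker transaction.
        System.out.println(overlaps("15:05:48.528", "15:05:49.941",
                                    "15:05:47.355", "15:05:49.983")); // prints true
    }
}
```

So the two cancellations are 36 ms apart, and the background transaction was open for the entire duration of the WorkerThread5 transaction. This only establishes the overlap; it says nothing about which side, if any, caused the cancellation.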

>>>> Note it runs over a different OSC, but still, might this be the culprit? Would EOF somehow cancel a fetch if two of them happen at the same moment, even though they happen in different ECs over different OSCs?
>>>>
>>>> I do not lock here, for there is absolutely no danger that several threads would use the same EC concurrently (though several different background threads could concurrently use ECs over the same OSC). I thought it was all right in this case. Is it not? Should I try to lock those ECs or even the OSC?
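
For reference, the usual discipline being asked about is lock, work, unlock in a finally block. The sketch below is plain Java and does not touch EOF at all: StandInContext is a hypothetical stand-in whose lock() and unlock() merely mirror the method names that EOEditingContext offers, and countRows() is an invented placeholder for a fetch. Whether locking would cure the problem in this thread is not established here.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical stand-in for an editing context; only the names lock()/unlock()
// are borrowed from EOEditingContext, everything else is illustrative.
final class StandInContext {
    private final ReentrantLock lock = new ReentrantLock();
    private int fetches = 0;

    void lock() { lock.lock(); }
    void unlock() { lock.unlock(); }
    boolean isLockedByCurrentThread() { return lock.isHeldByCurrentThread(); }

    // Placeholder for a fetch; refuses to run unless the caller holds the lock.
    int countRows() {
        if (!lock.isHeldByCurrentThread()) {
            throw new IllegalStateException("context used without holding its lock");
        }
        return ++fetches;
    }
}

final class LockedWork {
    // Runs the work with the context locked and always unlocks, even if the work throws.
    static <T> T withLock(StandInContext ec, Function<StandInContext, T> work) {
        ec.lock();
        try {
            return work.apply(ec);
        } finally {
            ec.unlock();
        }
    }
}
```

A background task would then call something like LockedWork.withLock(ec, StandInContext::countRows), so the unlock cannot be skipped by an exception thrown during the fetch.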

>>>> Thanks,
>>>> OC


>>>>> Any idea what might go wrong and how to fix it? Thanks!
>>>>> OC

>>>>>> On 11. 6. 2021, at 13:37, OCsite <webobjects-dev@wocommunity.org> wrote:
>>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> just bumped into another weird EOF case. A pretty plain fetch caused a “Cannot determine primary key for entity” exception. The row contains a number of columns whose values make sense, some null, some non-null, with one exception: the primary key, modelled as an attribute uid, is indeed null, thus the exception makes perfect sense.
>>>>>>
>>>>>> How can this happen?

>>>>>> ===
>>>>>> IllegalArgumentException: Cannot determine primary key for entity DBRecord from row: {... uid = <com.webobjects.foundation.NSKeyValueCoding$Null>; ... }
>>>>>>   at com.webobjects.eoaccess.EODatabaseChannel._fetchObject(EODatabaseChannel.java:348)
>>>>>>      ... skipped 2 stack elements
>>>>>>   at com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(EOObjectStoreCoordinator.java:488)
>>>>>>   at com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(EOEditingContext.java:4069)
>>>>>>   at er.extensions.eof.ERXEC.objectsWithFetchSpecification(ERXEC.java:1215)
>>>>>>      ... skipped 1 stack elements
>>>>>>   at com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsForSourceGlobalID(EOObjectStoreCoordinator.java:634)
>>>>>>   at com.webobjects.eocontrol.EOEditingContext.objectsForSourceGlobalID(EOEditingContext.java:3923)
>>>>>>   at er.extensions.eof.ERXEC.objectsForSourceGlobalID(ERXEC.java:1178)
>>>>>>      ... skipped 1 stack elements
>>>>>>   at com.webobjects.eoaccess.EOAccessArrayFaultHandler.completeInitializationOfObject(EOAccessArrayFaultHandler.java:77)
>>>>>>   at com.webobjects.eocontrol._EOCheapCopyMutableArray.willRead(_EOCheapCopyMutableArray.java:45)
>>>>>>   at com.webobjects.eocontrol._EOCheapCopyMutableArray.count(_EOCheapCopyMutableArray.java:103)
>>>>>>   at com.webobjects.foundation.NSArray.isEmpty(NSArray.java:1888)
>>>>>> ...
>>>>>> ===

>>>>>> Just in case it happens to be important (I believe it is not), the problem happens at the line
>>>>>>
>>>>>>     ... =eolist.representedObject.records().isEmpty()?...:...
>>>>>>
>>>>>> where records just returns storedValueForKey('records'), self-evidently a fault, which fires to fetch the rows.
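
The stack trace shows the chain isEmpty, count, willRead, completeInitializationOfObject, and finally the fetch. As a rough illustration of why a harmless-looking isEmpty() ends up hitting the database, here is a plain-Java sketch of a lazily filled collection. LazyRows and its members are invented names for this sketch; it does not reproduce EOF's actual fault classes.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical model of a to-many fault: the rows are fetched on first read only.
final class LazyRows<T> {
    private final Supplier<List<T>> fetch; // stands in for the database fetch
    private List<T> rows;                  // null until the fault has fired
    private int fetchCount;

    LazyRows(Supplier<List<T>> fetch) {
        this.fetch = fetch;
    }

    // Fires the fault on first access; later reads reuse the fetched rows.
    private synchronized List<T> willRead() {
        if (rows == null) {
            rows = fetch.get();
            fetchCount++;
        }
        return rows;
    }

    int count() { return willRead().size(); }

    // Even an emptiness check has to fire the fault, because the answer depends on the rows.
    boolean isEmpty() { return count() == 0; }

    synchronized int fetchCount() { return fetchCount; }
}
```

In this model, any exception raised while the supplier runs surfaces at the isEmpty() call site, which matches where the exception appears in the trace above.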

>>>>>> Searching the Web, all I've found is this (linked from here), which does not really help :) Truth is, some background threads do run at the moment; they are comparatively plain though and I can't see why they should cause the problem for the R/R thread. All they do is:
>>>>>>
>>>>>> 1. get their own OSC from the pool, making sure they never get the same OSC normal sessions have
>>>>>> 2. create a new ERXEC in this OSC
>>>>>> 3. get a local instance of an object in the EC

>>>>>> === this is the code of the background thread; a number of those run:
>>>>>>         def store
>>>>>>         for (def pool=ERXObjectStoreCoordinatorPool._pool();;) {
>>>>>>             store=pool.nextObjectStore
>>>>>>             if (store!=_sessionosc) break // there's one OSC for all sessions, stored in _sessionosc
>>>>>>         }
>>>>>>         return eo.localInstanceIn(ERXEC.newEditingContext(store)).numberOfMasterRowsWithoutOwner()
>>>>>> ===
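
The selection loop above (take the next store from the pool, skip the one reserved for sessions) can be sketched in plain Java. RoundRobinPool is a made-up stand-in for illustration, not the ERXObjectStoreCoordinatorPool API. Unlike the quoted loop, this sketch gives up after one full round instead of spinning forever when the pool holds nothing but the reserved store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical round-robin pool; illustrates the "skip the session store" selection only.
final class RoundRobinPool<T> {
    private final List<T> stores;
    private int next = 0;

    RoundRobinPool(List<T> stores) {
        this.stores = new ArrayList<>(stores);
    }

    // Hands out the stores in turn, wrapping around at the end.
    synchronized T nextStore() {
        T store = stores.get(next);
        next = (next + 1) % stores.size();
        return store;
    }

    // Returns the next store that is not the reserved instance,
    // or an empty Optional if one full round finds nothing else.
    synchronized Optional<T> nextStoreExcept(T reserved) {
        for (int tries = 0; tries < stores.size(); tries++) {
            T candidate = nextStore();
            if (candidate != reserved) { // identity check: skip that very instance
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With a pool of three stores where the first is the session store, successive calls to nextStoreExcept(sessionStore) return the second, the third, then the second again. Note that several background threads still share the non-session stores, which is the situation described earlier in the thread.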

>>>>>> and the method simply fetches:
>>>>>>
>>>>>> ===
>>>>>>     int numberOfMasterRowsWithoutOwner() {
>>>>>>         def mymasterrow=EOQualifier.qualifierWithQualifierFormat("importObject.dataBlock = %@ AND recordOwner = NULL",[this] as NSA)
>>>>>>         return ERXEOControlUtilities.objectCountWithQualifier(this.editingContext, 'DBRecord', mymasterrow)
>>>>>>     }
>>>>>> ===

>>>>>> Most of the time it works properly. Occasionally (rather rarely) the problem above happens. Can you see what I am doing wrong?
>>>>>>
>>>>>> Thanks a lot,
>>>>>> OC







