From: "Jesse Tayler" <jtayler@oeinc.com>
Subject: Re: [WO-DEV] Cannot determine primary key for entity
Date: Fri, 11 Jun 2021 11:47:44 -0400
To: WebObjects & WOnder Development <webobjects-dev@wocommunity.org>

Oh, FrontBase, did you ask those guys about it?

Now, of course, these EOF logs are from different editing contexts and perhaps different app/memory spaces?

At times in the past, although I do not recall the “science” behind this, we thought it best to separate application spaces: one for what users interact with, one for APIs or machine clients, and finally one for any automated or janitorial-type processes, each on its own application space.

I suppose that has benefits for capacity and load balancing, but in the end you only get one open channel to the database, and this may affect the state of things at very high concurrency.

FrontBase has certainly stood up to some extremely high loads in the past, anyway.



On Jun 11, 2021, at 11:32 AM, OCsite <webobjects-dev@wocommunity.org> wrote:

Jesse,

Did you dump the database’s own logic tree output?

nope, I did not. There are two reasons for that: (a) I don't know how to :) (the DB is FrontBase, incidentally), (b) even if I knew, I could not, for it happens on a production site where my access is seriously limited. So far, all attempts to repeat the problem on a test site have failed :/

It smells like a problem on the EOF side anyway. Of course I can't be sure, but it looks like the DB delivers properly, while EOF, for some weird reason — when the threads happen to clash in the wrong way — sometimes cancels reading the result set on its side.

To make sure — is there a debug setting which would log the SQL results received at the application side? All I know is

    NSLog.allowDebugLoggingForGroups(NSLog.DebugGroupSQLGeneration|NSLog.DebugGroupDatabaseAccess|NSLog.DebugGroupEnterpriseObjects)

plus log4j.logger.er.transaction.adaptor.EOAdaptorDebugEnabled=DEBUG; these, though, give me the SQL sent all right, but I do not see the results received.

Thanks a lot,
OC

On 11. 6. 2021, at 17:18, Jesse Tayler <webobjects-dev@wocommunity.org> wrote:
Did you dump the database’s own logic tree output? Sometimes you can see the points where it decides on using an index or whatever, and perhaps a failure is either visible or there’s a threshold limitation below EOF.

I doubt EOF has the awareness to do other than simply backtrace like this, so I just wonder if there are lower level reports you can review.

On Jun 11, 2021, at 11:08 AM, OCsite <webobjects-dev@wocommunity.org> wrote:

P.P.S. It actually does look like the weird cancelled fetch was somehow affected by the background task; upon further investigation of the case:

On 11. 6. 2021, at 15:20, OCsite <webobjects-dev@wocommunity.org> wrote:
===
15:05:48.528 DEBUG === Begin Internal Transaction //log:NSLog [WorkerThread5]
15:05:48.528 DEBUG evaluateExpression: <com.webobjects.jdbcadaptor.FrontbasePlugIn$FrontbaseExpression: "SELECT ... t0."C_UID", ... FROM "T_RECORD" t0 WHERE t0."C_IMPORT_ID" = 1003547" withBindings: > //log:NSLog [WorkerThread5]
15:05:49.937 DEBUG fetch canceled //log:NSLog [WorkerThread5]
15:05:49.937 DEBUG 164 row(s) processed //log:NSLog [WorkerThread5]
15:05:49.941 DEBUG === Commit Internal Transaction //log:NSLog [WorkerThread5]
15:05:49.941 INFO Database Exception occured: java.lang.IllegalArgumentException: Cannot determine primary key for entity DBRecord from row: {... uid = <com.webobjects.foundation.NSKeyValueCoding$Null>; ... } //log:er.transaction.adaptor.Exceptions [WorkerThread5]
===

I've found that a background fetch ran concurrently, and at essentially the same time — just a couple hundredths of a second later — it was cancelled, too:

===
15:05:47.355 DEBUG === Begin Internal Transaction //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
15:05:47.355 DEBUG evaluateExpression: <com.webobjects.jdbcadaptor.FrontbasePlugIn$FrontbaseExpression: "SELECT count(*) FROM "T_RECORD" t0, "T_IMPORT" T1, "T_IMPORT" T3, "T_RECORD" T2 WHERE (t0."C_OWNER__ID" is NULL AND T3."C_DATA_BLOCK_ID" = 1000387) AND t0."C_IMPORT_ID" = T1."C_UID" AND T2."C_IMPORT_ID" = T3."C_UID" AND T1."C_OWNER_RECORD_ID" = T2."C_UID"" withBindings: > //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
15:05:49.973 DEBUG fetch canceled //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
15:05:49.975 DEBUG 1 row(s) processed //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
15:05:49.983 DEBUG === Commit Internal Transaction //log:NSLog [MainPageSlaveRowsCountThread_Cizí nákupy EB]
===

Note it runs over a different OSC, but still — might this be the culprit? Would EOF somehow cancel a fetch if two of them happen at the same moment, albeit in different ECs over different OSCs?

I do not lock here, for there is absolutely no danger that multiple threads would use the same EC concurrently (though several different background threads could concurrently use ECs over the same OSC). I thought that is all right in this case. Is it not? Should I try to lock those ECs, or even the OSC?
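For context on the locking question: the usual discipline when editing contexts or an OSC are shared across threads is to bracket each unit of work in lock()/unlock() with a finally clause, so the lock is released even when the fetch throws. A minimal sketch of that shape, using a plain ReentrantLock as a stand-in for the real EOEditingContext/EOObjectStoreCoordinator locks (LockedFetch and countUnderLock are illustrative names, not WOnder API):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

public class LockedFetch {
    // Stand-in for the EC/OSC lock; with a real EOEditingContext this would be ec.lock()/ec.unlock().
    private static final ReentrantLock ecLock = new ReentrantLock();

    // Run one unit of work (e.g. a fetch) under the lock; the finally block
    // guarantees the lock is released even when the fetch throws.
    static int countUnderLock(Callable<Integer> fetch) throws Exception {
        ecLock.lock();           // ec.lock();
        try {
            return fetch.call(); // the actual fetch, e.g. objectCountWithQualifier(...)
        } finally {
            ecLock.unlock();     // ec.unlock(); always runs
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countUnderLock(() -> 164)); // prints 164
    }
}
```

Whether such locking would actually prevent the cancelled fetches seen here is exactly the open question in the thread; the sketch only shows the try/finally shape that locking usually takes.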

Thanks,
OC


Any idea what might go wrong and how to fix it? Thanks!
OC

On 11. 6. 2021, at 13:37, OCsite <webobjects-dev@wocommunity.org> wrote:

Hi there,

just bumped into another weird EOF case. A pretty plain fetch caused a “Cannot determine primary key for entity” exception. The row contains a number of columns whose values make sense, some null, some non-null, with one exception — the primary key, modelled as an attribute uid, is indeed null, so the exception makes perfect sense.

How can this happen?

===
IllegalArgumentException: Cannot determine primary key for entity DBRecord from row: {... uid = <com.webobjects.foundation.NSKeyValueCoding$Null>; ... }
  at com.webobjects.eoaccess.EODatabaseChannel._fetchObject(EODatabaseChannel.java:348)
     ... skipped 2 stack elements
  at com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsWithFetchSpecification(EOObjectStoreCoordinator.java:488)
  at com.webobjects.eocontrol.EOEditingContext.objectsWithFetchSpecification(EOEditingContext.java:4069)
  at er.extensions.eof.ERXEC.objectsWithFetchSpecification(ERXEC.java:1215)
     ... skipped 1 stack elements
  at com.webobjects.eocontrol.EOObjectStoreCoordinator.objectsForSourceGlobalID(EOObjectStoreCoordinator.java:634)
  at com.webobjects.eocontrol.EOEditingContext.objectsForSourceGlobalID(EOEditingContext.java:3923)
  at er.extensions.eof.ERXEC.objectsForSourceGlobalID(ERXEC.java:1178)
     ... skipped 1 stack elements
  at com.webobjects.eoaccess.EOAccessArrayFaultHandler.completeInitializationOfObject(EOAccessArrayFaultHandler.java:77)
  at com.webobjects.eocontrol._EOCheapCopyMutableArray.willRead(_EOCheapCopyMutableArray.java:45)
  at com.webobjects.eocontrol._EOCheapCopyMutableArray.count(_EOCheapCopyMutableArray.java:103)
  at com.webobjects.foundation.NSArray.isEmpty(NSArray.java:1888)
...
===

Just in case it happens to be important (I believe it is not), the problem happens at the row

        ... =eolist.representedObject.records().isEmpty()?...:...

where records just returns storedValueForKey('records'), self-evidently a fault, which fires to fetch the rows.

Searching the Web, all I've found is this (linked from here), which does not really help :) Truth is, some background threads do run at the moment; they are comparatively plain, though, and I can't see why they should cause the problem for the R/R thread. All they do is:

1. get their own OSC from the pool, making sure they never get the same OSC normal sessions have
2. create a new ERXEC in this OSC
3. get a local instance of an object in the EC

=== this is the code of the background thread; a number of those run:
    def store
    for (def pool=ERXObjectStoreCoordinatorPool._pool();;) {
        store=pool.nextObjectStore
        if (store!=_sessionosc) break // there's one OSC for all sessions, stored in _sessionosc
    }
    return eo.localInstanceIn(ERXEC.newEditingContext(store)).numberOfMasterRowsWithoutOwner()
===
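The selection loop above boils down to: keep asking the round-robin pool for its next store until it hands back one that is not the session OSC. A stand-in sketch of that logic in plain Java (RoundRobinPool and pickBackgroundStore are hypothetical names; the real code uses ERXObjectStoreCoordinatorPool):

```java
import java.util.List;

public class BackgroundStorePicker {
    // Stand-in for ERXObjectStoreCoordinatorPool: hands out stores round-robin.
    static class RoundRobinPool {
        private final List<String> stores;
        private int next = 0;
        RoundRobinPool(List<String> stores) { this.stores = stores; }
        String nextObjectStore() {
            String s = stores.get(next);
            next = (next + 1) % stores.size();
            return s;
        }
    }

    // Mirror of the background thread's loop: skip the one OSC reserved for sessions.
    // Caveat: like the original, this spins forever if the pool holds only the session store.
    static String pickBackgroundStore(RoundRobinPool pool, String sessionStore) {
        String store;
        do {
            store = pool.nextObjectStore();
        } while (store.equals(sessionStore));
        return store;
    }

    public static void main(String[] args) {
        RoundRobinPool pool = new RoundRobinPool(List.of("osc0", "osc1", "osc2"));
        System.out.println(pickBackgroundStore(pool, "osc0")); // prints osc1
    }
}
```

This guarantees background work never shares the sessions' OSC, which is the isolation the thread relies on; it says nothing about whether two background ECs on the same OSC are safe without locking, which is the question raised above.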

and the method simply fetches:

===
    int numberOfMasterRowsWithoutOwner() {
        def mymasterrow=EOQualifier.qualifierWithQualifierFormat("importObject.dataBlock = %@ AND recordOwner = NULL",[this] as NSA)
        return ERXEOControlUtilities.objectCountWithQualifier(this.editingContext, 'DBRecord', mymasterrow)
    }
===

Most of the time it works properly. Occasionally — rather rarely — the problem above happens. Can you see what I am doing wrong?
Thanks a lot,
OC






