X-CGP-ClamAV-Result: CLEAN X-VirusScanner: Niversoft's CGPClamav Helper v1.22.2a (ClamAV engine v0.102.2) X-Junk-Score: 0 [] X-KAS-Score: 0 [] From: "OCsite" Received: from smtp-beta-2.zoner.com ([217.198.120.71] verified) by post.selbstdenker.com (CommuniGate Pro SMTP 6.3.3) with ESMTPS id 26038026 for webobjects-dev@wocommunity.org; Thu, 17 Jun 2021 21:37:32 +0200 Received-SPF: none receiver=post.selbstdenker.com; client-ip=217.198.120.71; envelope-from=ocs@ocs.cz Received: from smtp.zoner.com (smtp.zoner.com [217.198.120.6]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp-beta-2.zoner.com (Postfix) with ESMTPS id 701301800121 for ; Thu, 17 Jun 2021 21:37:11 +0200 (CEST) Received: from smtpclient.apple (smtp2stechovice.cli-eurosignal.cz [77.240.99.254]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) (Authenticated sender: ocs@ocs.cz) by smtp.zoner.com (Postfix) with ESMTPSA id E61533000079 for ; Thu, 17 Jun 2021 21:37:10 +0200 (CEST) Content-Type: multipart/alternative; boundary="Apple-Mail=_C33A518F-979D-4DEC-A2C5-2EA4C90CA902" Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.100.0.2.22\)) Subject: Work-around (was: [WO-DEV] old/invalid :1 relationship value, how the H. possible?!?) Date: Thu, 17 Jun 2021 21:37:10 +0200 References: To: WebObjects & WOnder Development In-Reply-To: Message-Id: <45515297-5E43-4BBC-BAA5-3EA47504F6F7@ocs.cz> X-Mailer: Apple Mail (2.3654.100.0.2.22) --Apple-Mail=_C33A518F-979D-4DEC-A2C5-2EA4C90CA902 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 Given it did happen =E2=80=94 for whichever weird and arcane reason =E2=80= =94 at least once that after being unlocked/locked an EC still did not = have changes saved previously (before the unlock/lock), I am considering = to add this work-around for the one very important attribute (well, :1 = relationship, but that's essentially same thing): - whenever its value is being read from an eo, I also get its value from = the snapshot[1] and compare them - if they match, I simply use the value, it's definitely all right - if they do not match, I'll check whether the eo is in the = updatedObjects of its EC. If so, I'll again use the current value, it's = probably all right[2] - if they do not match and the eo is not an updated one, I simply = refault it. This should - cost essentially nothing in the most common case (access to snapshot = is very fast) - fix the problem =E2=80=94 should it happen again =E2=80=94 with a very = high probability[2] - cost little even in the worst case (refault + re-read from snapshot is = still pretty fast) Do I overlook something of importance? [1] Still not sure which one and how all those beasts differ, but I = guess EODatabaseContext.snapshotForGlobalID should be most reliable. = Hopefully. [2] There's some danger another attribute/relationship has been changed, = and then the one very important was not properly merged. I can't see a = reliable solution for this; do you perhaps? Your thoughts? Thanks a lot, OC > On 16. 6. 2021, at 15:01, OCsite = wrote: >=20 > Some followup: I have analysed the R/R loops in details and found that >=20 > - the user B saved changes at 11:00:25.220 > - at this moment, there was a user A's R/R loop (which never touched = the relevant data), running from 11:00:24.570 to 11:00:25.548 > - the next user A's R/R loop started at 11:00:38.113, read the = relevant data, did not see the changes of 11:00:25.220 >=20 > In other words, > - at 11:00:25.220, user A's EC was locked; it should not merge the = changes immediately... > - ... but, at 11:00:25.548 the user A's EC was unlocked: at this = moment the 11:00:25.220 changes should be merged into the EC... > - ... which self-evidently did not happen, for in the next R/R loop = started at 11:00:38.113 the 11:00:25.220 changes were still not visible. >=20 > How could that happen? What could prevent the merge at the EC = unlock/RR loop end? >=20 > Thanks for any advice, > OC >=20 >> On 16 Jun 2021, at 6:35, OCsite > wrote: >>=20 >> Matthew, >>=20 >> no luck this way: one instance only, and one OSC = (ERXObjectStoreCoordinatorPool.maxCoordinators=3D1). >>=20 >> Thanks, >> OC >>=20 >>> On 16. 6. 2021, at 2:57, Matthew Ness = > = wrote: >>>=20 >>>=20 >>> On Wed, Jun 16, 2021, at 1:42 AM, OCsite wrote: >>>> Hi there, >>>>=20 >>>> today we have bumped into a weird behaviour which I can't grok. We = got an old/invalid value for a :1 relationship to an important object = (important enough we even added its FK to the modelled locked attributes = aside of the PK; not that it helped any, but thanks to that, we see the = current snapshot values in the database operations). >>>>=20 >>>> 1. user A stored new object into the relationship; since we log all = database operations, we can see for his EC how this happened: >>>>=20 >>>> 11:00:12.995 UPDATE on 'DBAuction' ((uid =3D 1005602) and = (lastValidPriceOfferID =3D 1061694)) -> {lastValidPriceOfferID:1061695} >>>>=20 >>>> The saveChanges operation did not fail and the appropriate R/R loop = did end without a glitch, no exception, no problem at all, at = 11:00:13.036. >>>>=20 >>>> 2. user B changed the object; again, we can see for his EC >>>>=20 >>>> 11:00:25.220 UPDATE on 'DBAuction' ((uid =3D 1005602) and = (lastValidPriceOfferID =3D 1061695)) -> {lastValidPriceOfferID:1061698} >>>>=20 >>>> The saveChanges operation again finished successfully and the = appropriate R/R loop did end all right at 11:00:25.424. >>>>=20 >>>> 3. in a new R/R loop which started at 11:00:40.947, user A read the = value (the very standard way through = storedValueForKey('lastValidPriceOffer') =E2=80=94 which is the one we = observe all the time, modelled as a simple :1 relationship with FK = lastValidPriceOfferID) =E2=80=94, and got the stale object PK:1061695!=20= >>>>=20 >>>> How is it possible that the step 2 did not update values in user = A's EC? Incidentally, to make extra sure we do not get stale values even = if R/R threads happen to overlap, the code uses in the step 3 looks like = this: >>>>=20 >>>> synchronized (alock) { // lock for the DBAuction 1005602 object = (conceptually on PK to lock out all users regardless their ECs) >>>> try { >>>> auction.editingContext().unlock() // so as to process = recent changes from other ECs (thanks, Chuck!) >>>> auction.editingContext().lock() // of course we have = to keep it locked >>>> DBPriceOffer wins=3Dauction.lastValidPriceOffer() // = simply calls storedValueForKey('lastValidPriceOffer') >>>> println "... last valid $wins" >>>>=20 >>>> and this very log shows we have got the old price offer, the one = with PK:1061695. >>>>=20 >>>> I could understand this if user A changed the object to 1061695 and = did not save it (in which case the EC sync would not touch the change; = but then it would be all right, if the change was not saved), but he = very definitely did save the 1061695 successfully in step 1, and never = changed it later (until he got it again =E2=80=94 the old/invalid value! = =E2=80=94 in step 3). >>>>=20 >>>> Can anyone see a scenario which would lead to him seeing still = 1061695 in step 3, regardless that >>>> - he did save it successfully at 11:00:12 and never changed it = later; >>>> - another user did change it to 1061698 and successfully did save = at 11:00:25; >>>> - aside of the extra lock/unlock shown above, there is no manual EC = locking, ECs are auto-locked at R/Rs; >>>> - and besides, even if the extra lock/unlock was not used, the R/R = threads here did not overlap, so the change should be synced all right, = and the user should get 1061698 in the step 3? >>>>=20 >>>> What do I overlook? >>>>=20 >>>> Thanks and all the best, >>>> OC >>>>=20 >>>=20 >>> Hi, >>>=20 >>> How many instances of the application aretypically running? >>>=20 >>> Regards, >>>=20 >>> --=20 >>> Matt >>> https://logicsquad.net >>> https://www.linkedin.com/company/logic-squad/ = --Apple-Mail=_C33A518F-979D-4DEC-A2C5-2EA4C90CA902 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8 Given= it did happen =E2=80=94 for whichever weird and arcane reason =E2=80=94 = at least once that after being unlocked/locked an EC still did not have changes saved previously (before the unlock/lock), = I am considering to add this work-around for the one very important = attribute (well, :1 relationship, but that's essentially same = thing):

- = whenever its value is being read from an eo, I also get its value from = the snapshot[1] and compare them
- if they match, I = simply use the value, it's definitely all right
- = if they do not match, I'll check whether the eo is in the = updatedObjects of its EC. If so, I'll again use the current value, it's = probably all right[2]
- if they do not match and the = eo is not an updated one, I simply refault it.

This should
-= cost essentially nothing in the most common case (access to snapshot is = very fast)
- fix the problem =E2=80=94 should it = happen again =E2=80=94 with a very high probability[2]
- cost little even in the worst case (refault + re-read from = snapshot is still pretty fast)

Do I overlook something of = importance?

[1] = Still not sure which one and how all those beasts differ, but I = guess EODatabaseContext.snapshotForGlobalID should be most = reliable. Hopefully.

[2] There's some danger another = attribute/relationship has been changed, and then the one very = important was not properly merged. I can't see a reliable solution for = this; do you perhaps?

Your thoughts?

Thanks a lot,
OC

On 16. = 6. 2021, at 15:01, OCsite <webobjects-dev@wocommunity.org> wrote:

Some followup: I have = analysed the R/R loops in details and found that

- the user B saved changes at = 11:00:25.220
- at this moment, there was a user A's = R/R loop (which never touched the relevant data), running = from 11:00:24.570 to 11:00:25.548
- the = next user A's R/R loop started at 11:00:38.113, read the = relevant data, did not see the changes = of 11:00:25.220

In other words,
- at 11:00:25.220, user = A's EC was locked; it should not merge the changes = immediately...
- ... but, at 11:00:25.548 the user = A's EC was unlocked: at this moment the 11:00:25.220 changes should be = merged into the EC...
- ... which self-evidently = did not happen, for in the next R/R loop started at 11:00:38.113 = the 11:00:25.220 changes were still not visible.

How could that happen? What could = prevent the merge at the EC unlock/RR loop end?

Thanks for any advice,
OC

On 16 = Jun 2021, at 6:35, OCsite <webobjects-dev@wocommunity.org> wrote:

Matthew,

no luck this way: one instance only, and one OSC = (ERXObjectStoreCoordinatorPool.maxCoordinators=3D1).

Thanks,
OC

On 16. = 6. 2021, at 2:57, Matthew Ness <webobjects-dev@wocommunity.org> wrote:


On Wed, Jun 16, 2021, at = 1:42 AM, OCsite wrote:
Hi there,

today we have bumped into a weird = behaviour which I can't grok. We got an old/invalid value for a :1 = relationship to an important object (important enough we even added its = FK to the modelled locked attributes aside of the PK; not that it helped = any, but thanks to that, we see the current snapshot values in the = database operations).

1. user A stored new object into the = relationship; since we log all database operations, we can see for his = EC how this happened:

11:00:12.995 = UPDATE on 'DBAuction' ((uid =3D 1005602) and (lastValidPriceOfferID =3D = 1061694)) -> {lastValidPriceOfferID:1061695}

The saveChanges operation did not fail and = the appropriate R/R loop did end without a glitch, no exception, no = problem at all, at 11:00:13.036.

2. user B changed = the object; again, we can see for his EC

11:00:25.220 UPDATE on 'DBAuction' ((uid =3D = 1005602) and (lastValidPriceOfferID =3D 1061695)) -> = {lastValidPriceOfferID:1061698}

The saveChanges operation again finished = successfully and the appropriate R/R loop did end all right = at 11:00:25.424.

3. in a new R/R loop which started = at 11:00:40.947, user A read the value (the very standard way = through storedValueForKey('lastValidPriceOffer') =E2=80=94 which is the one = we observe all the time, modelled as a simple :1 relationship with = FK lastValidPriceOfferID) =E2=80=94, and got the = stale object PK:1061695

How is it possible = that the step 2 did not update values in user A's EC? Incidentally, to = make extra sure we do not get stale values even if R/R threads happen to = overlap, the code uses in the step 3 looks like this:

  =   synchronized (alock) = { // lock for the DBAuction 1005602 object (conceptually on PK to lock out all users = regardless their ECs)
    =     try {
    =         auction.editingContext().unlock() // so as to process recent changes = from other ECs (thanks, Chuck!)
    =         auction.editingContext().lock() =   // of = course we have to keep it locked
            = DBPriceOffer = wins=3Dauction.lastValidPriceOffer() // simply calls storedValueForKey('lastValidPriceOffer')
        =     println "... last valid = $wins"

and this very log shows we have got = the old price offer, the one with PK:1061695.

I could understand = this if user A changed the object to 1061695 and did not = save it (in which = case the EC sync would not touch the change; but then it would be all = right, if the change was not saved), but he very definitely did save = the 1061695 successfully in step 1, and never changed it later = (until he got it again =E2=80=94 the old/invalid value! =E2=80=94 in = step 3).

Can anyone see a scenario which would lead to him seeing = still 1061695 in step 3, regardless that
- he did save it successfully at 11:00:12 and never = changed it later;
- another user = did change it to 1061698 and successfully did save = at 11:00:25;
- aside of the = extra lock/unlock shown above, there is no manual EC locking, ECs are = auto-locked at R/Rs;
- and = besides, even if the extra lock/unlock was not used, the R/R = threads here did not overlap, so the change should be synced all right, and the user should get 1061698 in the step 3?

What do I overlook?

Thanks and all the = best,
OC


Hi,

How = many instances of the application aretypically running?

Regards,

-- 
Matt


= --Apple-Mail=_C33A518F-979D-4DEC-A2C5-2EA4C90CA902--