Quantcast

[PATCH 1 of 2] fsmonitor: acquire localrepo.wlock prior to emitting hg.update state

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[PATCH 1 of 2] fsmonitor: acquire localrepo.wlock prior to emitting hg.update state

Wez Furlong-2
# HG changeset patch
# User Wez Furlong <[hidden email]>
# Date 1495136887 25200
#      Thu May 18 12:48:07 2017 -0700
# Node ID f4b76c106f3af915667db7696e27780601e93074
# Parent  8a87bfc5bebbbe0ac996ac8e047a029eb931af45
fsmonitor: acquire localrepo.wlock prior to emitting hg.update state

we see some weird things in the watchman logs where the mercurial
process is seemingly confused about which hg.update state it is publishing
through watchman.

On closer examination, we're seeing conflicting pids for the clients involved
and this implies a race.

To resolve this, we extend the wlock around the state-enter/state-leave
events that are emitted to watchman.

Test Plan:
Some manual testing:

In one window, run this, and then checkout a different rev:

```
$ watchman -p -j <<<'["subscribe", "/data/users/wez/fbsource", "wez", {"expression": ["name", ".hg/updatestate"]}]'
{
    "version": "4.9.0",
    "subscribe": "wez",
    "clock": "c:1495034090:814028:1:312576"
}
{
    "state-enter": "hg.update",
    "version": "4.9.0",
    "clock": "c:1495034090:814028:1:312596",
    "unilateral": true,
    "subscription": "wez",
    "metadata": {
        "status": "ok",
        "distance": 125,
        "rev": "a1275d79ffa6c58b53116c8ec401c275ca6c1e2a",
        "partial": false
    },
    "root": "/data/users/wez/fbsource"
}
{
    "root": "/data/users/wez/fbsource",
    "metadata": {
        "status": "ok",
        "distance": 125,
        "rev": "a1275d79ffa6c58b53116c8ec401c275ca6c1e2a",
        "partial": false
    },
    "subscription": "wez",
    "unilateral": true,
    "version": "4.9.0",
    "clock": "c:1495034090:814028:1:312627",
    "state-leave": "hg.update"
}
```

Tailed the watchman log file and looked for invalid state assertion errors,
then ran my `rebase-all` script to update/rebase all of my heads.

Didn't trigger the error condition (but couldn't reliably trigger it previously
anyway), and the output captured above shows that the states are being emitted
correctly.

diff --git a/hgext/fsmonitor/__init__.py b/hgext/fsmonitor/__init__.py
--- a/hgext/fsmonitor/__init__.py
+++ b/hgext/fsmonitor/__init__.py
@@ -600,14 +600,25 @@
         self.node = node
         self.distance = distance
         self.partial = partial
+        self._lock = None
 
     def __enter__(self):
+        # We explicitly need to take a lock here, before we proceed to update
+        # watchman about the update operation, so that we don't race with
+        # some other actor.  merge.update is going to take the wlock almost
+        # immediately anyway, so this is effectively extending the lock
+        # around a couple of short sanity checks.
+        self._lock = self.repo.wlock()
         self._state('state-enter')
         return self
 
     def __exit__(self, type_, value, tb):
-        status = 'ok' if type_ is None else 'failed'
-        self._state('state-leave', status=status)
+        try:
+            status = 'ok' if type_ is None else 'failed'
+            self._state('state-leave', status=status)
+        finally:
+            if self._lock:
+                self._lock.release()
 
     def _state(self, cmd, status='ok'):
         if not util.safehasattr(self.repo, '_watchmanclient'):
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[PATCH 2 of 2] fsmonitor: don't attempt state-leave if we didn't state-enter

Wez Furlong-2
# HG changeset patch
# User Wez Furlong <[hidden email]>
# Date 1495136950 25200
#      Thu May 18 12:49:10 2017 -0700
# Node ID 726d02dfef6afc7030243cdbd43a4fc2ffb8be86
# Parent  f4b76c106f3af915667db7696e27780601e93074
fsmonitor: don't attempt state-leave if we didn't state-enter

The state-enter command may not have been successful; for example, the watchman
client session may have timed out if the user was busy/idle for a long period
during a merge conflict resolution earlier in processing a rebase for a stack
of diffs.

It's cleaner (from the perspective of the watchman logs) to avoid issuing the
state-leave command in these cases.

Test Plan:
ran

`hg rebase --tool :merge -r '(draft() & date(-14)) - master::' -d master`

and didn't observe any errors in the watchman logs or in the output from

`watchman -p -j <<<'["subscribe", "/data/users/wez/fbsource", "wez", {"expression": ["name", ".hg/updatestate"]}]'`

diff --git a/hgext/fsmonitor/__init__.py b/hgext/fsmonitor/__init__.py
--- a/hgext/fsmonitor/__init__.py
+++ b/hgext/fsmonitor/__init__.py
@@ -601,6 +601,7 @@
         self.distance = distance
         self.partial = partial
         self._lock = None
+        self.need_leave = False
 
     def __enter__(self):
         # We explicitly need to take a lock here, before we proceed to update
@@ -609,20 +610,21 @@
         # immediately anyway, so this is effectively extending the lock
         # around a couple of short sanity checks.
         self._lock = self.repo.wlock()
-        self._state('state-enter')
+        self.need_leave = self._state('state-enter')
         return self
 
     def __exit__(self, type_, value, tb):
         try:
-            status = 'ok' if type_ is None else 'failed'
-            self._state('state-leave', status=status)
+            if self.need_leave:
+                status = 'ok' if type_ is None else 'failed'
+                self._state('state-leave', status=status)
         finally:
             if self._lock:
                 self._lock.release()
 
     def _state(self, cmd, status='ok'):
         if not util.safehasattr(self.repo, '_watchmanclient'):
-            return
+            return False
         try:
             commithash = self.repo[self.node].hex()
             self.repo._watchmanclient.command(cmd, {
@@ -637,10 +639,12 @@
                     # whether the working copy parent is changing
                     'partial': self.partial,
             }})
+            return True
         except Exception as e:
             # Swallow any errors; fire and forget
             self.repo.ui.log(
                 'watchman', 'Exception %s while running %s\n', e, cmd)
+            return False
 
 # Bracket working copy updates with calls to the watchman state-enter
 # and state-leave commands.  This allows clients to perform more intelligent
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH 2 of 2] fsmonitor: don't attempt state-leave if we didn't state-enter

Augie Fackler-2
On Thu, May 18, 2017 at 02:22:03PM -0700, Wez Furlong wrote:
> # HG changeset patch
> # User Wez Furlong <[hidden email]>
> # Date 1495136950 25200
> #      Thu May 18 12:49:10 2017 -0700
> # Node ID 726d02dfef6afc7030243cdbd43a4fc2ffb8be86
> # Parent  f4b76c106f3af915667db7696e27780601e93074
> fsmonitor: don't attempt state-leave if we didn't state-enter

queued, thanks
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Loading...