hgext.notify as a changegroup hook generates a diff between wrong versions

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

hgext.notify as a changegroup hook generates a diff between wrong versions

Valentin David
Hello,

It seemed that the bug tracker is down so I cannot search whether that
is a known issue or not. I hope it is not.

When hgext.notify gets a changegroup, it generates a diff between p1
of the first node of the changegroup to the new tip.

I think what is intended is the old tip to the new tip.

OK, let's say the first approach is called "approach A", and the
second one "approach B". I will now show some uses cases. In *all* cases
revisions 0, 1 and 2 are already on the repository and 3 is the first
revision of the changegroup. And of course the changegroup is ordered by
revision numbers.

(Please view it with a monospaced font)

_Case 1_

  5
 /|
4 2
| |
3 1
 \|
  0

Approach A: 0::5
Approach B: 2::5

We note, that approach B will show all the modifications brought to
the repository, but not what is already here. Whereas approach A shows
the diff of that new "branch".

You might still think that approach A is the right one, but I have
more cases. Here we say that 4 is a merge commit with p2 being 1.

_Case 2_

  5
 /|
4 2
|\|
3 1
 \|
  0

Approach A: 0::5
Approach B: 2::5

You notice that we included the change from 0 to 1 (and also 1 to 2)
when 0 to 1 was already merged into that .

Let's make several branches.

_Case 3_

    6
   /|
  5 |
 /| |
| 3 2
 \ \|
  4 1
   \|
    0

Yes, it is weird, but it is all valid.

Approach A: 1::6
Approach B: 2::6

Where is 0 to 1? It disappeared, wizzzzzzz! Though 1 to 2 did not
disappear. Approach B however is more consistent since it does not
show none of those two changesets. I hope this last case convinced you.

If nobody has time to fix that, but the maintainers think the bug is
valid, I can look at making a patch.

--
Valentin David


_______________________________________________
Mercurial mailing list
[hidden email]
http://selenic.com/mailman/listinfo/mercurial
Reply | Threaded
Open this post in threaded view
|

Re: hgext.notify as a changegroup hook generates a diff between wrong versions

Matt Mackall
On Mon, 2012-06-11 at 18:01 +0200, Valentin David wrote:
> Hello,
>
> It seemed that the bug tracker is down so I cannot search whether that
> is a known issue or not. I hope it is not.
>
> When hgext.notify gets a changegroup, it generates a diff between p1
> of the first node of the changegroup to the new tip.
>
> I think what is intended is the old tip to the new tip.

No, definitely not. Anything that assumes the old tip is relevant is
almost certainly broken for multiple heads or multiple branches. Here,
this will give you a diff between your new branch head and whichever
random branch was pushed to last.

(Generally 'tip' is a concept best erased from the vocabulary of a
modern Hg user, as it invites these sorts of conceptual errors.)

> OK, let's say the first approach is called "approach A", and the
> second one "approach B". I will now show some uses cases. In *all* cases
> revisions 0, 1 and 2 are already on the repository and 3 is the first
> revision of the changegroup. And of course the changegroup is ordered by
> revision numbers.
>
> (Please view it with a monospaced font)
>
> _Case 1_
>
>   5
>  /|
> 4 2
> | |
> 3 1
>  \|
>   0

FYI, we have a convention for drawing graphs:

http://mercurial.selenic.com/wiki/DrawingTextDAGs

> Approach A: 0::5
> Approach B: 2::5

Consider this:

0-1-3-5
 \
  2-4

3 through 5 were pushed. What should the notification diff for this push
look like?

There is, of course, no remotely meaningful answer possible. Diff is
fundamentally ill-suited to handling anything but a linear change. Which
is why notify does the most obvious thing: assume you're diffing a
linear change.

I think anyone who's looking at aggregate diffs from notify is in a
state of conceptual sin similar to trying to divide by zero.

--
Mathematics is the supreme nostalgia of our time.


_______________________________________________
Mercurial mailing list
[hidden email]
http://selenic.com/mailman/listinfo/mercurial
Reply | Threaded
Open this post in threaded view
|

Re: hgext.notify as a changegroup hook generates a diff between wrong versions

Valentin David
On Mon, 2012-06-11 at 18:26 -0500, Matt Mackall wrote:

> On Mon, 2012-06-11 at 18:01 +0200, Valentin David wrote:
> > Hello,
> >
> > It seemed that the bug tracker is down so I cannot search whether that
> > is a known issue or not. I hope it is not.
> >
> > When hgext.notify gets a changegroup, it generates a diff between p1
> > of the first node of the changegroup to the new tip.
> >
> > I think what is intended is the old tip to the new tip.
>
> No, definitely not. Anything that assumes the old tip is relevant is
> almost certainly broken for multiple heads or multiple branches. Here,
> this will give you a diff between your new branch head and whichever
> random branch was pushed to last.

Of course, you are right. I am sorry, I am quite new in using
Mercurial.

So I propose something else: for each new head, we look for the maximum
revision number of ancestors already in the repository. And we do a diff
from that node to the respective head.

> Consider this:
>
> 0-1-3-5
>  \
>   2-4
>
> 3 through 5 were pushed. What should the notification diff for this push
> look like?

Right, on my approach it would have been 2 to 5 which would make little
sense. I was thinking too much of my use case where my pedantic hooks
would actually prevent this to happen.

On what there is now in hgext.notify, it would get just a diff between 1
and 5. It is a bug, right?

If we accepted to have several diffs as I now propose, we would then get
two diffs: 1 to 5 and 2 to 4. I think that would make more sense.

> There is, of course, no remotely meaningful answer possible. Diff is
> fundamentally ill-suited to handling anything but a linear change. Which
> is why notify does the most obvious thing: assume you're diffing a
> linear change.

But it is obviously broken when there is a merge or multiple heads
created in the changegroup. I would then like at least to be able to
disable diff on those conditions and only those.

> I think anyone who's looking at aggregate diffs from notify is in a
> state of conceptual sin similar to trying to divide by zero.

The feature is there, and it should be supported or discontinued. But
you cannot keep a broken feature.

I find aggregating commits in one email better for my use case. We have
one repository per branch. They all send notification to a mailing list
because we want to monitor all branches. However, we do not want to see
50 (or even more) new emails for commits that were already notified from
a branch before.

While the diff generated when a linear strand of commits is pushed makes
sense, the diff on a merge is real perturbation and I would like to see
it gone or fixed.

So to sum up a bit, here are the approaches I would prefer:
 - If a merge or several new heads are detected, the diff is not
generated. Otherwise we keep the diff like it is implemented.
 - If several heads are detected, the diff is not generated. Otherwise
we make a diff to the new head from the max revision of what is already
on the repository that is an ancestor of that new head.
 - For each new head, we make a diff like on the previous point. We just
keep the diffs separated.
 - We do not aggregate the diff, instead we put a diff for each commit.

I hope that at least one of those approaches make sense.

--
Valentin David


_______________________________________________
Mercurial mailing list
[hidden email]
http://selenic.com/mailman/listinfo/mercurial
Reply | Threaded
Open this post in threaded view
|

Re: hgext.notify as a changegroup hook generates a diff between wrong versions

Matt Mackall
On Tue, 2012-06-12 at 12:23 +0200, Valentin David wrote:
> I think anyone who's looking at aggregate diffs from notify is in a
> > state of conceptual sin similar to trying to divide by zero.
>
> The feature is there, and it should be supported or discontinued. But
> you cannot keep a broken feature.

Let me try this again: when someone types '1 ÷ 0' into their calculator,
is it their calculator that's "broken"? Should the '÷' feature be
discontinued or properly supported? If you raise this issue in a
calculator BTS, you can be sure it will be marked "invalid / wontfix".

There are currently two ways for notify to work:

- one diff per changeset
- one diff total

The latter, as demonstrated above, cannot always give a meaningful
answer because diff _definitionally_ can't handle them.

We can't redefine diff though, nor should we redefine "one diff total"
to start meaning "multiple diffs".

Fortunately, we can add new buttons to our calculator. Feel free to
propose a brand new button.

> So to sum up a bit, here are the approaches I would prefer:
>  - If a merge or several new heads are detected, the diff is not
> generated. Otherwise we keep the diff like it is implemented.
>  - If several heads are detected, the diff is not generated. Otherwise
> we make a diff to the new head from the max revision of what is already
> on the repository that is an ancestor of that new head.
>  - For each new head, we make a diff like on the previous point. We just
> keep the diffs separated.
>  - We do not aggregate the diff, instead we put a diff for each commit.

We should probably have a mode that sends a diff per commit in the same
email. And we should probably have a mode that ignores merges while
doing that.

--
Mathematics is the supreme nostalgia of our time.


_______________________________________________
Mercurial mailing list
[hidden email]
http://selenic.com/mailman/listinfo/mercurial
Reply | Threaded
Open this post in threaded view
|

Re: hgext.notify as a changegroup hook generates a diff between wrong versions

Valentin David
On Tue, 2012-06-12 at 11:20 -0500, Matt Mackall wrote:

> On Tue, 2012-06-12 at 12:23 +0200, Valentin David wrote:
> > I think anyone who's looking at aggregate diffs from notify is in a
> > > state of conceptual sin similar to trying to divide by zero.
> >
> > The feature is there, and it should be supported or discontinued. But
> > you cannot keep a broken feature.
>
> Let me try this again: when someone types '1 ÷ 0' into their calculator,
> is it their calculator that's "broken"? Should the '÷' feature be
> discontinued or properly supported? If you raise this issue in a
> calculator BTS, you can be sure it will be marked "invalid / wontfix".

Division is a partial function. There is two way to deal with the
partiality of function.

 1. You expect that nobody will ever call that. It is a precondition.
For debugging purpose you can add an assertion, but in the end you do
not really care about the behavior when input is wrong, the result can
be wrong. An example free from the C library. You need to provide some
valid pointer. What happens when you give something invalid will just
depend on the implementation.

 2. You can expect that someone will provide something erroneous. Then
you have to handle "exceptions" (it does not have to be exceptions from
the programming language). If you call "open" (system call) for example
on a file that does not exist, it will have to return an error. Could
you test that the file exists before? Sure, but that does not help
because someone might have deleted it between the two calls.

The case of the calculator is obviously case 2. I will never buy a
calculator that prints "42" when I asks for "1/0". Or worse, if it just
starts physically burning. So yes, it can be a bug.

Pushing several heads, or having several merge commits is legal. So we
cannot have a precondition. We need to handle that case. It does not
matter how the problem is solved to me. Take the easiest one. But I do
not want to see a result that first looks meaningful, but at the end is
completely meaningless. That is a bug.

--
Valentin David


_______________________________________________
Mercurial mailing list
[hidden email]
http://selenic.com/mailman/listinfo/mercurial
Reply | Threaded
Open this post in threaded view
|

Re: hgext.notify as a changegroup hook generates a diff between wrong versions

Valentin David
In reply to this post by Matt Mackall
On Tue, 2012-06-12 at 11:20 -0500, Matt Mackall wrote:

> There are currently two ways for notify to work:
>
> - one diff per changeset
> - one diff total
>
> The latter, as demonstrated above, cannot always give a meaningful
> answer because diff _definitionally_ can't handle them.
>
> We can't redefine diff though, nor should we redefine "one diff total"
> to start meaning "multiple diffs".
>
> Fortunately, we can add new buttons to our calculator. Feel free to
> propose a brand new button.
>
> > So to sum up a bit, here are the approaches I would prefer:
> >  - If a merge or several new heads are detected, the diff is not
> > generated. Otherwise we keep the diff like it is implemented.
> >  - If several heads are detected, the diff is not generated. Otherwise
> > we make a diff to the new head from the max revision of what is already
> > on the repository that is an ancestor of that new head.
> >  - For each new head, we make a diff like on the previous point. We just
> > keep the diffs separated.
> >  - We do not aggregate the diff, instead we put a diff for each commit.
>
> We should probably have a mode that sends a diff per commit in the same
> email. And we should probably have a mode that ignores merges while
> doing that.

OK, what if I provide an implementation for mode 1, 2 and 4, accessible
by configuration, but keep the old mode if nothing is configured?

Mode 3 would be too weird to read.

--
Valentin David


_______________________________________________
Mercurial mailing list
[hidden email]
http://selenic.com/mailman/listinfo/mercurial
Reply | Threaded
Open this post in threaded view
|

Re: hgext.notify as a changegroup hook generates a diff between wrong versions

David Champion
In reply to this post by Matt Mackall
* On 12 Jun 2012, Matt Mackall wrote:
>
> There are currently two ways for notify to work:
>
> - one diff per changeset

(and each diff in a separate email)

> - one diff total
>
> ...
> Fortunately, we can add new buttons to our calculator. Feel free to
> propose a brand new button.
>
> > So to sum up a bit, here are the approaches I would prefer:
> >  - If a merge or several new heads are detected, the diff is not
> > generated. Otherwise we keep the diff like it is implemented.
> >  - If several heads are detected, the diff is not generated. Otherwise
> > we make a diff to the new head from the max revision of what is already
> > on the repository that is an ancestor of that new head.
> >  - For each new head, we make a diff like on the previous point. We just
> > keep the diffs separated.
> >  - We do not aggregate the diff, instead we put a diff for each commit.
>
> We should probably have a mode that sends a diff per commit in the same
> email. And we should probably have a mode that ignores merges while
> doing that.

We have notify.merge=False already, so let's reuse that.  A change
to send one diff per changeset in a single email -- that is, in a
changrgroup hook -- is what we can't do now.

Here is a configuration structure that permits all the options
described.  As appealing as it might seem, I'm not convinced that anyone
needs the 'head' options, but it could be done.

hooks.incoming.notify = python:hgext.notify.hook
        Send one diff per email per changeset

hooks.changegroup.notify = python:hgext.notify.hook
notify.diffmode = aggregate (default)
        Send one diff for the whole lot, in a single email

hooks.changegroup.notify = python:hgext.notify.hook
notify.diffmode = individual
notify.mime = False
        Send one diff per changeset in a single email

hooks.changegroup.notify = python:hgext.notify.hook
notify.diffmode = individual
notify.mime = True (default)
        Send one diff per changeset in a single email, each in a separate MIME part

hooks.changegroup.notify = python:hgext.notify.hook
notify.diffmode = head
notify.mime = False
        Send one diff per branch (head) in a single email

hooks.changegroup.notify = python:hgext.notify.hook
notify.diffmode = head
notify.mime = True (default)
        Send one diff per branch (head) in a single email, each in a separate MIME part

For Principle of Least Surprise compliance, if notify.mime is considered
at all for aggregate diffs, it should default to False.  But it should
probably be true for other modes.

I'd be glad to help with this, if needed.  I've had my hands in notify a
bit.

--
David Champion • [hidden email] • IT Services • University of Chicago
_______________________________________________
Mercurial mailing list
[hidden email]
http://selenic.com/mailman/listinfo/mercurial