Understanding and modifying the default diff for commits


Understanding and modifying the default diff for commits

Andres Sommerhoff
Hi all, I want to intervene in the diff operation used by Mercurial’s commit. I want to record only the meaningful changes in a heavy directory tree full of XML files (to make it easier to audit what has really changed, though saving some disk space doesn’t hurt either). I searched the internet for a Mercurial add-on that could help, but without success, so I am thinking of writing my own extension (or maybe some scripting in a pre-commit hook).

I would appreciate any advice on where to start intervening in the diff process during a Mercurial commit if I write my own extension. Could anyone help me locate the diff code that Mercurial uses, so I can read it and learn how commit interacts with it?

If you are curious about the problem I’m trying to deal with: the software is KNIME, and projects (scientific models) developed in it are saved as many XML files, where each XML file represents a small portion of the model (a “node”, as KNIME calls it). One project can easily have more than 500 nodes (and therefore XML files). If I change a single node and save the project, not only is the single related file changed, but all 500 XML files are updated as well: inside each XML file the “last modification date” and “last author” are changed.

I’m looking to skip all the files in which the only change was an update to the “last modification date” and “last author”. That way I can focus on the important changes: the meaningful modifications become easy to audit, merges are far less cumbersome, and the history is much cleaner when running a log on a specific file.
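To make the goal concrete, here is a minimal Python sketch of the check described above: two versions of a file count as "the same" when they differ only in lines that carry the volatile fields. The key names `lastModified` and `author`, and the assumption that each field sits on its own line as a `key="..."` attribute, are hypothetical; the real names have to be read from an actual KNIME XML file.

```python
import re

# Hypothetical key names; check a real KNIME XML file for the actual ones.
VOLATILE_KEYS = ("lastModified", "author")

# Matches any line that carries one of the volatile keys as an attribute.
_VOLATILE_LINE = re.compile(
    r'key="(?:%s)"' % "|".join(re.escape(k) for k in VOLATILE_KEYS)
)


def stable_lines(xml_text):
    """Return the lines of xml_text that do not mention a volatile key."""
    return [line for line in xml_text.splitlines()
            if not _VOLATILE_LINE.search(line)]


def only_volatile_changes(old_text, new_text):
    """True if the two versions differ at most in the volatile lines."""
    return stable_lines(old_text) == stable_lines(new_text)
```

Calling `only_volatile_changes(old, new)` with the committed and the freshly saved text of a node file would then tell whether the save touched anything besides those two fields. The line-based comparison is a deliberate simplification: it avoids parsing the XML, but it would miss a volatile field that shares a line with real content.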

Maybe a simple command-line option for commit is the solution, but I see no official option in the commit command for using an alternative diff tool to calculate the patches. On the other hand, as far as I have read, the “extdiff” extension only affects the comparison of revisions, not the commit process (maybe I’m wrong on this last point). Or maybe a commit command line using the function “diff([includepattern [, excludepattern]]):” in conjunction with “--exclude” will do the magic I’m looking for, but I haven’t been able to figure it out yet.

I’m on Windows 10, using Mercurial and TortoiseHg 5.0.2.

Regards, Andres

_______________________________________________
Mercurial mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial

RE: Understanding and modifying the default diff for commits

Becker, Mischa J-2

Have you done any testing to see if KNIME needs those two fields?  That is, if you delete them from the XML files, does it cause problems, or does KNIME just re-create them the next time you save?

 

If they aren't actually needed, the simplest thing to do would be to write a script that strips those fields from your XML files.  No need to modify the diff.  The one time I did this, instead of creating a hook, I just committed the script to my repo and ran it manually when needed.
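A stripping script along those lines could look roughly like the Python sketch below. It is only an illustration: the key names `lastModified` and `author` are hypothetical placeholders, and it assumes the fields are stored as elements with a `key` attribute. Note also that `xml.etree.ElementTree` does not reproduce the input byte for byte (the XML declaration is dropped and namespace prefixes may be rewritten), so the first run may produce a large one-time diff.

```python
import xml.etree.ElementTree as ET

# Hypothetical key names; replace with the ones found in the real files.
VOLATILE_KEYS = {"lastModified", "author"}


def strip_volatile_fields(xml_text):
    """Return xml_text with every element whose key is volatile removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the elements first so removing children does not
    # disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("key") in VOLATILE_KEYS:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

Running this over every XML file before `hg commit` (manually, as described above, or from a hook) would keep the two fields out of the repository entirely.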

 

Mischa Becker

 

From: Mercurial <[hidden email]> On Behalf Of Andres Sommerhoff
Sent: Sunday, June 7, 2020 4:45 PM
To: [hidden email]
Subject: Understanding and modifying the default diff for commits

 



Re: Understanding and modifying the default diff for commits

Andres Sommerhoff
Thank you Mischa. Unfortunately, the software warns the user that the field is missing. A screenshot of an example warning is attached.

image.png

It is just a complaint, no more than that, as the model was loaded and working anyway. However, I will investigate a bit further to find an alternative that avoids the annoying loading message (maybe a hook or a simple script, as you suggested).
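One alternative that leaves the fields in place might be a script, run before committing, that reverts every modified XML file whose only changes are in those two fields, so that only the files with real changes remain modified. The Python sketch below is untested against a real KNIME project and rests on assumptions: the key names `lastModified` and `author` are hypothetical, each field is assumed to sit on its own line, and file names are assumed to decode as UTF-8. It uses only the standard `hg root`, `hg status --modified --no-status`, `hg cat -r .` and `hg revert --no-backup` commands.

```python
import os
import re
import subprocess

# Hypothetical key names; check a real KNIME XML file for the actual ones.
VOLATILE_LINE = re.compile(rb'key="(?:lastModified|author)"')


def stable_lines(data):
    """Lines of a file (as bytes) that do not carry a volatile key."""
    return [line for line in data.splitlines()
            if not VOLATILE_LINE.search(line)]


def noise_only_files(modified, read_committed, read_working):
    """Paths whose committed and working copies differ only in volatile lines."""
    return [path for path in modified
            if stable_lines(read_committed(path)) == stable_lines(read_working(path))]


def hg(*args, cwd=None):
    """Run an hg command and return its standard output as bytes."""
    return subprocess.run(("hg",) + args, cwd=cwd, check=True,
                          stdout=subprocess.PIPE).stdout


def read_file(path):
    with open(path, "rb") as handle:
        return handle.read()


def revert_noise_only_files():
    """Revert modified XML files that changed only in the volatile fields."""
    root = hg("root").decode().strip()
    modified = hg("status", "--modified", "--no-status", cwd=root).decode().splitlines()
    xml_files = [path for path in modified if path.lower().endswith(".xml")]
    noisy = noise_only_files(
        xml_files,
        lambda path: hg("cat", "-r", ".", path, cwd=root),
        lambda path: read_file(os.path.join(root, path)),
    )
    for path in noisy:
        # Discards the date/author update for this file in the working copy.
        hg("revert", "--no-backup", path, cwd=root)
    return noisy
```

Calling `revert_noise_only_files()` from inside the repository and then committing as usual would record only the files with meaningful changes. Since the revert discards working-copy changes without a backup, it is worth trying on a clone first.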

Thank you again for your answer!

Andres Sommerhoff




On Tue, Jun 9, 2020 at 7:07 PM Becker, Mischa J <[hidden email]> wrote:

