[PATCH] who: remove OpenJDK

[PATCH] who: remove OpenJDK

David Demelier-2
# HG changeset patch
# User David Demelier <[hidden email]>
# Date 1595664656 -7200
#      Sat Jul 25 10:10:56 2020 +0200
# Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
# Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
who: remove OpenJDK

They unfortunately moved to GitHub.

https://openjdk.java.net/jeps/369

diff -r b0e3c6141a78 -r 7eaad1ed8c74 templates/who/index.html
--- a/templates/who/index.html Fri Jul 26 14:27:08 2019 +0200
+++ b/templates/who/index.html Sat Jul 25 10:10:56 2020 +0200
@@ -9,9 +9,6 @@
         <h3>Mozilla</h3>
         Mozilla is an open source project that is currently developing the popular <a href="https://www.mozilla.org/firefox">Firefox</a> internet browser, the email client <a href="https://www.mozilla.org/thunderbird">Thunderbird</a> and the application suite SeaMonkey. Mozilla chose Mercurial in 2006.</p>
         <p><a href="https://www.mozilla.org">https://www.mozilla.org</a></p>
-        <h3>Java / OpenJDK</h3>
-        OpenJDK is the official open sourced Java implementation of Sun Microsystems. When open sourcing the project, Sun chose Mercurial as their main version control system.
-        <p><a href="http://openjdk.java.net/">http://openjdk.java.net/</a></p>
         <h3>Nginx</h3>
         The nginx web server is among one of the most popular and used over the world.
         <p><a href="http://nginx.org/">http://nginx.org/</a></p>

_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Re: [PATCH] who: remove OpenJDK

Pulkit Goyal
On Sat, Jul 25, 2020 at 1:43 PM David Demelier <[hidden email]> wrote:

>
> # HG changeset patch
> # User David Demelier <[hidden email]>
> # Date 1595664656 -7200
> #      Sat Jul 25 10:10:56 2020 +0200
> # Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
> # Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
> who: remove OpenJDK
>
> They unfortunately moved to GitHub.
>
> https://openjdk.java.net/jeps/369

Queued this, many thanks!

>
> diff -r b0e3c6141a78 -r 7eaad1ed8c74 templates/who/index.html
> --- a/templates/who/index.html  Fri Jul 26 14:27:08 2019 +0200
> +++ b/templates/who/index.html  Sat Jul 25 10:10:56 2020 +0200
> @@ -9,9 +9,6 @@
>          <h3>Mozilla</h3>
>          Mozilla is an open source project that is currently developing the popular <a href="https://www.mozilla.org/firefox">Firefox</a> internet browser, the email client <a href="https://www.mozilla.org/thunderbird">Thunderbird</a> and the application suite SeaMonkey. Mozilla chose Mercurial in 2006.</p>
>          <p><a href="https://www.mozilla.org">https://www.mozilla.org</a></p>
> -        <h3>Java / OpenJDK</h3>
> -        OpenJDK is the official open sourced Java implementation of Sun Microsystems. When open sourcing the project, Sun chose Mercurial as their main version control system.
> -        <p><a href="http://openjdk.java.net/">http://openjdk.java.net/</a></p>
>          <h3>Nginx</h3>
>          The nginx web server is among one of the most popular and used over the world.
>          <p><a href="http://nginx.org/">http://nginx.org/</a></p>

Re: [PATCH] who: remove OpenJDK

Antonio Muci via Mercurial-devel
In reply to this post by David Demelier-2
That's sad.

Apparently OpenJDK started contemplating a migration to git one year ago
(2019-07-12): https://openjdk.java.net/jeps/357

I am quoting (an edited version of) the "motivation" section of that
ticket, because I'd like us to reflect on how mercurial is perceived
"out there":


===

Motivation:

There are three primary reasons for migrating to Git:

1. Size of version control system metadata:

   a. Initial prototypes [...] show a significant reduction in
[metadata] size. For example, the .git directory of the jdk/jdk
repository is approximately 300 MB with Git and the .hg directory is
around 1.2 GB with Mercurial, depending on the Mercurial version being
used. The reduction in metadata preserves local disk space and reduces
clone times [...].

   b. Git also features shallow clones that only clone parts of the
history, resulting in even less metadata for those users who do not need
the entire history.


2. Available tooling

There are many more tools for interacting with Git than Mercurial:

    a. All text editors have Git integration, either natively or in the
form of plugins including Emacs (magit plugin), Vim (fugitive.git
plugin), VS Code (builtin), and Atom (builtin).

    b. Almost all integrated development environments (IDEs) also ship
with Git integration out-of-the-box, including IntelliJ (builtin),
Eclipse (builtin), NetBeans (builtin), and Visual Studio (builtin).

    c. There are multiple desktop clients available for interacting with
Git repositories locally.

3. Available hosting

Lastly, there are many options available for hosting Git repositories,
whether self-hosted or hosted as a service.

===

About .hg size (1a): is it really true that .hg is 1.2GB and the
corresponding .git version is 300 MB? Verifying it should not be too
difficult. If it's true (I doubt it), something has to be done.

Shallow clones (1b): I never needed that, but now I am curious: do we
have a similar feature in core or in an extension? If yes (and even if
no, really), how can we better communicate that feature-wise mercurial is
on par with (and sometimes better than) git?

Tooling (2): maybe git has much more, but TortoiseHG has a lot of
potential. Also, a lot of git tools are not Free Software.

Hosting (3): there is Heptapod, there is Kallithea (how is it doing?).
Once more, there is not enough communication IMHO.






Re: [PATCH] who: remove OpenJDK

Josef 'Jeff' Sipek
On Sat, Jul 25, 2020 at 12:27:42 +0200, Antonio Muci via Mercurial-devel wrote:
> That's sad.

Yeah.

This motivated me enough to clone the repos (hg and git) and collect some
data.  Maybe people here will find it useful.

First off, the clone itself.  I cloned it from the official upstream repos.
My internet connection is 150 Mbit/s, the storage is a 3-way ZFS mirror.  I
used hg 4.9.1 (py27), and git 2.21.0.  (I know, I need to update both.  This
is on a box that has a solid network connection but is harder to update.  If
there is interest I can spend the effort to update them and re-run it with
newer versions.)

$ hg clone https://hg.openjdk.java.net/jdk/jdk
destination directory: jdk
requesting all changes
adding changesets
adding manifests
adding file changes
added 60318 changesets with 516970 changes to 187542 files
new changesets fd16c54261b3:227cd01f15fa
updating to branch default
65415 files updated, 0 files merged, 0 files removed, 0 files unresolved

This took a total of ~16.3 mins (978 seconds), of which:

 1) ~30 seconds were used by "adding changesets"
 2) ~8 mins were used by "adding manifests"
 3) ~7 mins were used by "adding file changes"

The adding of manifests and files was receiving ~1.0-1.2 MB/s (bytes
received on the NIC, *not* actual payload inside TCP and hg specific
framing).

My box still had plenty of CPU, RAM, and I/O left so I don't know if the 1.0
MB/s was a result of hg being sub-optimal or if the hg server or the network
connection were the bottleneck.

To rule out internet slowness, I ran 'hg serve' on the clone and did a clone
on my laptop (5.5rc0+25-fbc53c5853b0, py3) on the same subnet (wifi
connected).  It took 495 seconds (2x faster), and I saw slightly higher
network utilization (~1.7 MB/s) and the laptop CPU pegged at 100% for pretty
much the entire duration of the "adding file changes" portion.  (The laptop
has an SSD, so that probably helped eliminate some of the slowness - it is a
bit of an apples and oranges comparison, but interesting nonetheless.)

Cloning directly from java.net on my laptop took 1400 seconds - so, about
50% slower.  This could be because of the wifi, py3 vs. py27, hg version
difference, etc., etc.


$ git clone https://github.com/openjdk/jdk.git jdk-git
Cloning into 'jdk-git'...
remote: Enumerating objects: 819, done.
remote: Counting objects: 100% (819/819), done.
remote: Compressing objects: 100% (577/577), done.
remote: Total 1072595 (delta 356), reused 423 (delta 199), pack-reused 1071776
Receiving objects: 100% (1072595/1072595), 414.42 MiB | 6.17 MiB/s, done.
Resolving deltas: 100% (800673/800673), done.
Checking out files: 100% (65415/65415), done.

This took a total of 1 min 49 secs (109 seconds), of which:

 1) 1 min 8 secs were used by "receiving objects"
 2) 25 seconds were used by "resolving deltas"

The receiving of objects was pulling in 6.8 MB/s.

Cloning directly on my laptop took 99 seconds with git version 2.26.2.
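
The timings above reduce to a couple of ratios.  A small Python sketch of
the arithmetic, using only the numbers reported in this message (the
variable names are illustrative, and the phase durations are the
approximate values quoted above):

```python
# Wall-clock clone times in seconds, as reported above.
hg_clone_s = 978    # hg 4.9.1, cloned from hg.openjdk.java.net
git_clone_s = 109   # git 2.21.0, cloned from github.com

# Approximate hg phase durations quoted above, in seconds.
hg_phases_s = {
    "adding changesets": 30,
    "adding manifests": 8 * 60,
    "adding file changes": 7 * 60,
}

# How many times longer the hg clone took than the git clone.
slowdown = hg_clone_s / git_clone_s

# Fraction of the hg clone spent in the manifest phase.
manifest_share = hg_phases_s["adding manifests"] / hg_clone_s

print(f"hg clone took {slowdown:.1f}x as long as git clone")
print(f"manifests accounted for about {manifest_share:.0%} of the hg clone")
```

This compares a single run of each tool against different servers, so it
says nothing about where the bottleneck was.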

...
> About .hg size (1a): is it really true that .hg is 1.2GB and the
> corresponding .git version is 300 MB? Verifying it should not be too
> difficult. If it's true (I doubt it), something has to be done.

$ du -shA jdk-*/.{hg,git}
1.10G   jdk-hg/.hg
452M    jdk-git/.git

So, both numbers seem to be tweaked to justify migration - at least on a
fresh clone - but I'd say hg is worse by 2-3x.
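
To put the "2-3x" next to the JEP's claim, here is a short Python sketch
of both ratios.  It assumes that du reports binary units (1G = 1024M) and
treats the JEP figures the same way; the variable names are illustrative.

```python
MIB_PER_GIB = 1024  # assumption: du -h reports binary units

# Sizes measured on the fresh clones above, in MiB.
measured_hg_mib = 1.10 * MIB_PER_GIB
measured_git_mib = 452

# Sizes claimed in the JEP 357 motivation section, in MiB.
jep_hg_mib = 1.2 * MIB_PER_GIB   # "around 1.2 GB"
jep_git_mib = 300                # "approximately 300 MB"

measured_ratio = measured_hg_mib / measured_git_mib
jep_ratio = jep_hg_mib / jep_git_mib

print(f"measured: .hg is {measured_ratio:.1f}x the size of .git")
print(f"JEP 357:  .hg is {jep_ratio:.1f}x the size of .git")
```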

The whole checkout in case anyone cares:

$ du -shA *
1014M   jdk-git
1.65G   jdk-hg

Now, hg specifics.  It looks like the manifest is huge.  This corresponds to
how long it took to download.

-rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 12:16 00changelog.d
-rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 12:01 00changelog.i
-rw-r--r--   1 jeffpc   jeffpc      434M Jul 25 12:09 00manifest.d
-rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 12:09 00manifest.i

Not a complete surprise given that there are a lot of files (~65k) tracked
and many use super-long file paths (e.g.,
test/hotspot/jtreg/runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java).
That adds up.  Just the paths in the manifest itself add up to almost 4.7MB.

$ hg manifest | wc
   65415   65415 4694467

I'm guessing that they would have benefited from treemanifest.
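
For a sense of scale, a rough Python estimate based on the wc output
above.  It assumes the flat manifest stores one entry per file made of the
path, a NUL byte, a 40-character hex node and a newline (flags are
ignored); the variable names are illustrative.

```python
tracked_files = 65415          # lines reported by wc
path_list_bytes = 4694467      # bytes reported by wc (paths plus newlines)

# Average bytes per path, including the trailing newline.
avg_path_bytes = path_list_bytes / tracked_files

# Assumed per-entry overhead on top of "path\n": NUL + 40 hex digits.
ENTRY_OVERHEAD_BYTES = 1 + 40

# Estimated size of one full (undeltified) manifest revision.
full_manifest_bytes = path_list_bytes + ENTRY_OVERHEAD_BYTES * tracked_files

print(f"average path length: {avg_path_bytes:.1f} bytes")
print(f"one full manifest:   {full_manifest_bytes / 2**20:.1f} MiB")
```

So every full snapshot of the manifest costs several megabytes before
compression, which is why the quality of the delta chains matters so much
here.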


I also tried to clone locally to see what sort of thing a user would see.

$ hg clone jdk-hg test
$ git clone jdk-git test-git

hg took 60 seconds (with hot cache, ~120 secs cold cache), git took 13
seconds.  Git hardlinked the one big pack file, while hg hardlinked each of
the files in .hg/store.  Obviously, hardlinking 2 files is much faster than
hardlinking ~180k.  (treemanifest would have made this even worse for hg.)


I just kicked off a conversion to treemanifest.  It'll take a while.

Jeff.

--
Intellectuals solve problems; geniuses prevent them
                - Albert Einstein

Re: [PATCH] who: remove OpenJDK

Joerg Sonnenberger
On Sat, Jul 25, 2020 at 01:36:32PM -0400, Josef 'Jeff' Sipek wrote:
> First off, the clone itself.  I cloned it from the official upstream repos.
> My internet connection is 150 Mbit/s, the storage is a 3-way ZFS mirror.  I
> used hg 4.9.1 (py27), and git 2.21.0.  (I know, I need to update both.  This
> is on a box that has a solid network connection but is harder to update.  If
> there is interest I can spend the effort to update them and re-run it with
> newer versions.)

It should be noted that for all intents and purposes, a git clone is
much more comparable to hg clone --stream.

> Now, hg specifics.  It looks like the manifest is huge.  This corresponds to
> how long it took to download.
>
> -rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 12:16 00changelog.d
> -rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 12:01 00changelog.i
> -rw-r--r--   1 jeffpc   jeffpc      434M Jul 25 12:09 00manifest.d
> -rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 12:09 00manifest.i

I have similar reservations about the way manifests are handled for the
NetBSD repository. It's been a topic of discussion recently on IRC. The
manifest processing itself currently takes nearly half of the total
clone time and that looks ...suspicious at best.


> I'm guessing that they would have benefited from treemanifest.

From my testing, treemanifests don't help at all.

> I also tried to clone locally to see what sort of thing a user would see.
>
> $ hg clone jdk-hg test
> $ git clone jdk-git test-git
>
> hg took 60 seconds (with hot cache, ~120 secs cold cache), git took 13
> seconds.  Git hardlinked the one big pack file, while hg hardlinked each of
> the files in .hg/store.  Obviously, hardlinking 2 files is much faster than
> hardlinking ~180k.  (treemanifest would have made this even worse for hg.)

Using a unified storage would help somewhat in general, but I don't
consider local clone a big use case. share serves the purpose generally
much better.

> I just kicked off a conversion to treemanifest.  It'll take a while.

Did you convert to generaldelta etc. already?

Joerg

Re: [PATCH] who: remove OpenJDK

Josef 'Jeff' Sipek
On Sun, Jul 26, 2020 at 04:11:06 +0200, Joerg Sonnenberger wrote:

> On Sat, Jul 25, 2020 at 01:36:32PM -0400, Josef 'Jeff' Sipek wrote:
> > First off, the clone itself.  I cloned it from the official upstream repos.
> > My internet connection is 150 Mbit/s, the storage is a 3-way ZFS mirror.  I
> > used hg 4.9.1 (py27), and git 2.21.0.  (I know, I need to update both.  This
> > is on a box that has a solid network connection but is harder to update.  If
> > there is interest I can spend the effort to update them and re-run it with
> > newer versions.)
>
> It should be noted that for all intents and purposes, a git clone is
> much more comparable to hg clone --stream.

I don't know if this is a temporary error or if the java.net server
disallows it, but:

$ hg clone --stream https://hg.openjdk.java.net/jdk/jdk jdk-stream
streaming all changes
abort: locking the remote repository failed

It'd make sense for this to be disabled by policy, because you don't want
someone doing a slow streaming pull to lock the server's repo for hours
preventing other pushes (assuming that's the same lock).


Doing the clone over the LAN (gigabit ethernet) took 1m26s total (including
the checkout):

$ hg clone --stream http://server-host:8000 test-hg
streaming all changes
187754 files to transfer, 1.07 GB of data
transferred 1.07 GB in 45.5 seconds (24.0 MB/sec)
updating to branch default
65415 files updated, 0 files merged, 0 files removed, 0 files unresolved

The client host was running at 99% CPU while receiving the data, while the
server was at around 80-90%.  So, I'm concluding that in this local case I
was CPU bound on the client, but the server wasn't exactly lightly loaded.

For comparison, git cloning (including checkout) over the same LAN took 60
seconds.  So, faster than hg streaming clone, but only by ~26 seconds.
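
The LAN numbers can be broken down a bit further.  A small Python sketch,
assuming hg reports binary units (1 GB = 1024 MB) and using only the
figures above; the variable names are illustrative:

```python
MIB_PER_GIB = 1024  # assumption: hg reports binary units

streamed_mib = 1.07 * MIB_PER_GIB   # "1.07 GB of data"
stream_seconds = 45.5               # time spent streaming
hg_total_s = 86                     # 1m26s, including the checkout
git_total_s = 60                    # git clone over the same LAN

# Streaming throughput, to compare with hg's own "24.0 MB/sec".
throughput_mib_s = streamed_mib / stream_seconds

# Time hg spent outside the streaming phase (mostly the checkout).
hg_after_stream_s = hg_total_s - stream_seconds

# Difference between the two total clone times.
gap_s = hg_total_s - git_total_s

print(f"streaming throughput: {throughput_mib_s:.1f} MiB/s")
print(f"hg time after streaming: {hg_after_stream_s:.1f} s")
print(f"hg slower than git by: {gap_s} s")
```

Roughly half of the hg time was spent after the data had already arrived.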

> > Now, hg specifics.  It looks like the manifest is huge.  This corresponds to
> > how long it took to download.
> >
> > -rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 12:16 00changelog.d
> > -rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 12:01 00changelog.i
> > -rw-r--r--   1 jeffpc   jeffpc      434M Jul 25 12:09 00manifest.d
> > -rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 12:09 00manifest.i
>
> I have similar reservations about the way manifests are handled for the
> NetBSD repository. It's been a topic of discussion recently on IRC. The
> manifest processing itself currently takes nearly half of the total
> clone time and that looks ...suspicious at best.

Indeed.  I don't have the knowledge/experience to suggest improvements, but
I can run benchmarks :)

> > I'm guessing that they would have benefited from treemanifest.
>
> From my testing, treemanifests don't help at all.

They seemed to help with the jdk repo.  I'm guessing that jdk has more deeply
nested directories with longer file names, because the conversion certainly
seemed to help (tm == treemanifest):

$ hg --config extensions.convert= convert ../jdk-hg . ../tm-map
$ cd ..
$ du -sAh */.{git,hg}
452M    jdk-git/.git
1.11G   jdk-hg/.hg
784M    jdk-tm/.hg

Not amazing, but it is about 70% of the "monolithic" manifest repo.  The
manifest part itself:

$ ls -lh 00*
-rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 20:46 00changelog.d
-rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 20:47 00changelog.i
-rw-r--r--   1 jeffpc   jeffpc     4.08M Jul 25 20:46 00manifest.d
-rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 20:47 00manifest.i

$ du -sAh meta    
89.4M   meta

So, the (treemanifest) manifest data is about 97M total vs. 437MB total with
the monolithic manifest.  This equates to 22% of the original manifest size.
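
The percentages above, spelled out as a Python sketch.  The inputs are the
ls and du figures from this message, with du assumed to report binary
units; the variable names are illustrative.

```python
MIB_PER_GIB = 1024  # assumption: du -h reports binary units

# Manifest data in MiB: data file plus index (plus meta/ for treemanifest).
flat_manifest_mib = 434 + 3.67
tree_manifest_mib = 4.08 + 3.67 + 89.4

# Whole .hg directories in MiB.
flat_repo_mib = 1.11 * MIB_PER_GIB
tree_repo_mib = 784

manifest_ratio = tree_manifest_mib / flat_manifest_mib
repo_ratio = tree_repo_mib / flat_repo_mib

print(f"treemanifest manifest data: {tree_manifest_mib:.0f} MiB "
      f"({manifest_ratio:.0%} of the flat manifest)")
print(f"treemanifest .hg: {repo_ratio:.0%} of the flat .hg")
```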

...
> > I just kicked off a conversion to treemanifest.  It'll take a while.
>
> Did you convert to generaldelta and etc already?

'hg clone' produced a reasonable repo without conversion.  The only
requirement added during the conversion was treemanifest.

$ cat jdk-hg/.hg/requires
dotencode
fncache
generaldelta
revlogv1
sparserevlog
store
$ diff jdk-{hg,tm}/.hg/requires
6a7
> treemanifest

I can try other requirements, but I think the manifest problem jdk people
saw was the huge size due to data duplication inside the manifest data -
duplication that went away by manifest subtree "dedup" between revisions.

Jeff.

--
UNIX is user-friendly ... it's just selective about who its friends are

Re: [PATCH] who: remove OpenJDK

Joerg Sonnenberger
On Sun, Jul 26, 2020 at 11:12:25AM -0400, Josef 'Jeff' Sipek wrote:
> > > I'm guessing that they would have benefited from treemanifest.
> >
> > From my testing, treemanifests don't help at all.
>
> They seemed to help with the jdk repo.  I'm guessing that jdk has a deeper
> nested directories with longer file names because the conversion certainly
> seemed to help (tm == treemanifest):

Can you run "hg debugupgraderepo -o re-delta-all" once? IIRC the
original repository doesn't use generaldelta and this would also affect
the manifest.

Joerg

Re: [PATCH] who: remove OpenJDK

Josef 'Jeff' Sipek
On Sun, Jul 26, 2020 at 18:35:03 +0200, Joerg Sonnenberger wrote:

> On Sun, Jul 26, 2020 at 11:12:25AM -0400, Josef 'Jeff' Sipek wrote:
> > > > I'm guessing that they would have benefited from treemanifest.
> > >
> > > From my testing, treemanifests don't help at all.
> >
> > They seemed to help with the jdk repo.  I'm guessing that jdk has a deeper
> > nested directories with longer file names because the conversion certainly
> > seemed to help (tm == treemanifest):
>
> Can you run "hg debugupgraderepo -o re-delta-all" once? IIRC the
> original repository doesn't use generaldelta and this would also affect
> the manifest.

$ hg debugupgraderepo -o re-delta-all --run --no-backup
...
beginning upgrade...
repository locked and read-only
creating temporary repository to stage migrated data: /ws/tmp/jdk-hg/.hg/upgrade.UaS6Ss
(it is safe to interrupt this process any time before data migration completes)
migrating 637431 total revisions (516970 in filelogs, 60143 in manifests, 60318 in changelog)
migrating 1.07 GB in store; 298 GB tracked data
migrating 187542 filelogs containing 516970 revisions (625 MB in store; 11.9 GB tracked data)
finished migrating 516970 filelog revisions across 187542 filelogs; change in size: -2.14 MB
migrating 1 manifests containing 60143 revisions (438 MB in store; 286 GB tracked data)
finished migrating 60143 manifest revisions across 1 manifests; change in size: -382 MB
migrating changelog containing 60318 revisions (28.8 MB in store; 175 MB tracked data)
finished migrating 60318 changelog revisions; change in size: 0 bytes
finished migrating 637431 total revisions; total change in store size: -384 MB
copying phaseroots
...

Wow, that's a massive change to the manifest size!

-rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 26 13:55 00changelog.d
-rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 26 13:55 00changelog.i
-rw-r--r--   1 jeffpc   jeffpc     52.3M Jul 26 13:54 00manifest.d
-rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 26 13:54 00manifest.i

After the repo upgrade, I ran hg serve and cloned it (non-streaming).  The
clone's manifest is somewhat larger but still reasonably sized:

-rw-r--r--   1 jeffpc  jeffpc    25M Jul 26 14:23 00changelog.d
-rw-r--r--   1 jeffpc  jeffpc   3.7M Jul 26 14:16 00changelog.i
-rw-r--r--   1 jeffpc  jeffpc    61M Jul 26 14:17 00manifest.d
-rw-r--r--   1 jeffpc  jeffpc   3.7M Jul 26 14:17 00manifest.i
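
As a sanity check, the upgrade output and the file listing agree with each
other.  A short Python sketch of that arithmetic; it assumes binary units
throughout and reuses the 452M .git size from my earlier message, and the
last ratio compares the hg store only (not the whole .hg directory) with
.git, so treat it as a rough estimate:

```python
MIB_PER_GIB = 1024  # assumption: hg and du report binary units

# Manifest size predicted from the debugupgraderepo output, in MiB.
manifest_before_mib = 438
manifest_change_mib = -382
predicted_manifest_mib = manifest_before_mib + manifest_change_mib

# Manifest size observed afterwards: 00manifest.d + 00manifest.i.
observed_manifest_mib = 52.3 + 3.67

# Store size after the upgrade, from "1.07 GB in store" and "-384 MB".
store_after_mib = 1.07 * MIB_PER_GIB - 384

# Rough comparison with the .git directory measured earlier (452M).
git_dir_mib = 452
store_vs_git = store_after_mib / git_dir_mib

print(f"manifest: predicted {predicted_manifest_mib} MiB, "
      f"observed {observed_manifest_mib:.1f} MiB")
print(f"store after upgrade: {store_after_mib:.0f} MiB "
      f"({store_vs_git:.1f}x .git)")
```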

Jeff.

--
mainframe, n.:
  An obsolete device still used by thousands of obsolete companies serving
  billions of obsolete customers and making huge obsolete profits for their
  obsolete shareholders. And this year's run twice as fast as last year's.

Re: [PATCH] who: remove OpenJDK

Augie Fackler-2
In reply to this post by Joerg Sonnenberger
On Sat, Jul 25, 2020 at 10:19 PM Joerg Sonnenberger <[hidden email]> wrote:

>
> On Sat, Jul 25, 2020 at 01:36:32PM -0400, Josef 'Jeff' Sipek wrote:
> > First off, the clone itself.  I cloned it from the official upstream repos.
> > My internet connection is 150 Mbit/s, the storage is a 3-way ZFS mirror.  I
> > used hg 4.9.1 (py27), and git 2.21.0.  (I know, I need to update both.  This
> > is on a box that has a solid network connection but is harder to update.  If
> > there is interest I can spend the effort to update them and re-run it with
> > newer versions.)
>
> It should be noted that for all intents and purposes, a git clone is
> much more comparable to hg clone --stream.

One thing we did on Google Code that I've never been able to convince
someone to try is cache deltas: we had an outage caused by delta
computation being slow, and a side effect of that was caching deltas
pretty aggressively. That moved our servers from being CPU-bound to
being IO-bound on BigTable reads, and IIRC we were able to satisfy
pretty much any request at client-limited speeds from then on. It'd
probably still be a worthwhile effort to see about allowing memcached
or similar to store deltas for a server pool and let them avoid
significant amounts of delta computation.
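
To make the idea concrete, here is a minimal Python sketch of a delta
cache.  It is not what Google Code ran and it uses no Mercurial API: a
plain dict stands in for memcached, and toy_delta, DeltaCache and the
revision names are all hypothetical.

```python
from typing import Callable, Dict, Tuple


def toy_delta(base: bytes, target: bytes) -> bytes:
    """Hypothetical stand-in for real (expensive) delta computation.

    Encodes the length of the common prefix as 4 big-endian bytes,
    followed by the rest of the target.
    """
    prefix = 0
    limit = min(len(base), len(target))
    while prefix < limit and base[prefix] == target[prefix]:
        prefix += 1
    return prefix.to_bytes(4, "big") + target[prefix:]


class DeltaCache:
    """Memoize deltas keyed by (base node, target node).

    The dict stands in for memcached or a similar shared store.
    """

    def __init__(
        self,
        load: Callable[[str], bytes],
        compute: Callable[[bytes, bytes], bytes],
    ) -> None:
        self._load = load
        self._compute = compute
        self._store: Dict[Tuple[str, str], bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, base_node: str, target_node: str) -> bytes:
        key = (base_node, target_node)
        cached = self._store.get(key)
        if cached is not None:
            self.hits += 1
            return cached
        # Cache miss: load both revisions and compute the delta once.
        self.misses += 1
        delta = self._compute(self._load(base_node), self._load(target_node))
        self._store[key] = delta
        return delta


# Two made-up revisions, looked up by made-up node names.
revisions = {"rev-a": b"hello world", "rev-b": b"hello there"}
cache = DeltaCache(load=revisions.__getitem__, compute=toy_delta)

first = cache.get("rev-a", "rev-b")    # computed, then stored
second = cache.get("rev-a", "rev-b")   # served from the cache
print(cache.hits, cache.misses)
```

Since a revision's content never changes once it has a node id, entries in
such a cache would not need invalidation, only eviction when space runs
out.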

AF

Re: [PATCH] who: remove OpenJDK

Pierre-Yves David-2
In reply to this post by Josef 'Jeff' Sipek


On 7/25/20 7:36 PM, Josef 'Jeff' Sipek wrote:
> On Sat, Jul 25, 2020 at 12:27:42 +0200, Antonio Muci via Mercurial-devel wrote:
>> That's sad.
>
> Yeah.
>
> This motivated me enough to clone the repos (hg and git) and collect some
> data.  Maybe people here will find it useful.

I got in touch with the OpenJDK people one and a half years ago. The
version of Mercurial they use on the server is extremely old. The
repository format they use is ancient (not even generaldelta IIRC).

Moving to a modern Mercurial version, using sparse revlog for storage
and recomputing deltas gave a massive improvement in storage size and
clone performance.

However, I never managed to get them to even upgrade Mercurial on the
server side. Some of the issues OpenJDK had were legitimate concerns that
we could improve on, but a good share was also a lack of interest in
actually improving their Mercurial situation. The desire to move to GitHub
for community reasons was strong.

--
Pierre-Yves David

Re: [PATCH] who: remove OpenJDK

Antonio Muci via Mercurial-devel
> On 31/07/2020 17:55 Pierre-Yves David <[hidden email]> wrote:
>
> I got in touch with the OpenJDK people one and a half years ago. [...]

Very active move on your part. Kudos.


> Moving to a modern Mercurial version, using sparse revlog for storage
> and recomputing delta gave a massive boost to storage size and clone
> performance.

At least this is reassuring: performance-wise, mercurial has not fallen that far behind.
The tests performed by Josef and Joerg confirm that a performance disadvantage does exist, but it's not massive.

> a good share was also a lack of interest in
> actually improving their Mercurial situation. The desire to move to GitHub
> for community reasons was strong.

I can understand wanting to benefit from the GitHub network effect, and I do not want to focus on it here.

What concerns me the most are two things:


1. scripta manent: when, some years from now, people google for "mercurial performance", they will stumble upon the JDK considerations and take them for granted. What will remain in a potential user's head is "mercurial is slow, go for git. JDK guys have done the same". There is no other written material counterbalancing these moves (except, possibly, for the very interesting blog entries by Gregory Szorc), and so the collective mindset slowly slips away.

2. (consequence of 1) no mindset that another valid SCM exists: SCM == GitHub, because - obviously - git == "hosted service with integrated issue tracker, CI and whatnot", right?

I am wondering if the countermeasures to this have to be only technical. I see this more as a communication disadvantage compared to the git ecosystem.

Re: [PATCH] who: remove OpenJDK

Pierre-Yves David-2


On 7/31/20 6:30 PM, Antonio Muci wrote:
> I am wondering if the countermeasures to this have to be only technical. I see this more as a communication disadvantage compared to the git ecosystem.

We could definitely use more communication :-/

--
Pierre-Yves David

Re: [PATCH] who: remove OpenJDK

Joerg Sonnenberger
In reply to this post by Antonio Muci via Mercurial-devel
On Fri, Jul 31, 2020 at 06:30:57PM +0200, Antonio Muci via Mercurial-devel wrote:
> What concerns me the most are two things:
>
> 1. scripta manent: when in some years people will google for "mercurial
> performance" they will stumble upon JDK considerations, and take them
> for granted. What will remain in a potential user's head is "mercurial
> is slow, go for git. JDK guys have done the same". There is no other
> written material counterweighting these moves (except for very
> interesting blog entries by Gregory Szorc, possibly), and so the
> collective mindset slowly slips away.

I fully agree with the problem and I've had to deal with the same issue
before. I consider the write-up from the OpenJDK people to be quite
dishonest, but there is little that we can do about it.

Joerg

Re: [PATCH] who: remove OpenJDK

Josef 'Jeff' Sipek
In reply to this post by Antonio Muci via Mercurial-devel
On Fri, Jul 31, 2020 at 18:30:57 +0200, Antonio Muci wrote:
> > On 31/07/2020 17:55 Pierre-Yves David <[hidden email]> wrote:
...
> > Moving to a modern Mercurial version, using sparse revlog for storage
> > and recomputing delta gave a massive boost to storage size and clone
> > performance.
>
> At least this reassures that performance-wise mercurial has not fallen
> behind so much.
> The tests performed by Josef and Joerg confirm that a performance
> disadvantage exists indeed, but it's not massive.

Keep in mind that I did only clone testing.  I use both hg and git (hg
because I want to, git because I have to), and I have to admit that
something as simple as 'hg log' / 'git log' feels completely different.
git's log output appears on the screen instantaneously, while hg's takes a
fraction of a second.  It is a small fraction, but it "feels" slower.  I
think this has been diagnosed over and over as slow python startup.

...
> What concerns me the most are two things:
>
> 1. scripta manent: when in some years people will google for "mercurial
> performance" they will stumble upon JDK considerations, and take them for
> granted. What will remain in a potential user's head is "mercurial is
> slow, go for git. JDK guys have done the same". There is no other written
> material counterweighting these moves (except for very interesting blog
> entries by Gregory Szorc, possibly), and so the collective mindset slowly
> slips away.

Around 2010, I messed quite a bit with the XFS file system in Linux.  It was
really annoying that users kept finding "tuning guide" Slashdot posts from
2001-2003 that were completely wrong, and kept using them.  Often, this
resulted in worse performance, but the users were also bad at benchmarking,
so they didn't notice until it was too late and their file systems had a lot
of data.  (I think it has gotten better, but those horrid guides are still
out there.)  In other words, it takes a *lot* of effort to make sure people
on the internet don't find misinformation.  I don't really know how, but I
think it needs to be a concentrated effort to be "louder" than the
misinformation.  (I consider outdated information misinformation.)

Jeff.

--
All science is either physics or stamp collecting.
                - Ernest Rutherford

Re: [PATCH] who: remove OpenJDK

Marcus Harnisch-2
In reply to this post by David Demelier-2
On 25/07/2020 10.11, David Demelier wrote:
> # HG changeset patch
> # User David Demelier <[hidden email]>
> # Date 1595664656 -7200
> #      Sat Jul 25 10:10:56 2020 +0200
> # Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
> # Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
> who: remove OpenJDK
>
> They unfortunately moved to GitHub.


Both OpenJDK and NetBeans are still mentioned here:

   https://www.mercurial-scm.org/about

Perhaps these could be replaced with other large repos. Mozilla comes to
mind.

Cheers,
Marcus


Re: [PATCH] who: remove OpenJDK

Pulkit Goyal
On Wed, Aug 5, 2020 at 1:55 PM Marcus Harnisch <[hidden email]> wrote:

>
> On 25/07/2020 10.11, David Demelier wrote:
> > # HG changeset patch
> > # User David Demelier <[hidden email]>
> > # Date 1595664656 -7200
> > #      Sat Jul 25 10:10:56 2020 +0200
> > # Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
> > # Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
> > who: remove OpenJDK
> >
> > They unfortunately moved to GitHub.
>
>
> Both OpenJDK and NetBeans are still mentioned here:
>
>    https://www.mercurial-scm.org/about

Oops, if possible, can you email a patch for this?  The website
repository lives at https://www.mercurial-scm.org/repo/hg-website/.
>
> Perhaps these could be replaced with other large repos. Mozilla comes to
> mind.

Yes, that sounds like a good replacement.
>
> Cheers,
> Marcus
>

Thanks and Regards
Pulkit

Re: [PATCH] who: remove OpenJDK

Marcus Harnisch-2
On 05/08/2020 10.47, Pulkit Goyal wrote:
> Oops, if possible, can you email a patch for this?  The website
> repository lives at https://www.mercurial-scm.org/repo/hg-website/.

Like that?

# HG changeset patch
# User Marcus Harnisch <[hidden email]>
# Date 1596620149 -7200
#      Wed Aug 05 11:35:49 2020 +0200
# Node ID 0c8591045dbaadd8043b592a627296183eabf633
# Parent  7eaad1ed8c743d40fe71620434f3a151f0067105
about: Replace OpenJDK and NetBeans with Mozilla

Both have moved to Git.

diff --git templates/about/index.html templates/about/index.html
--- templates/about/index.html
+++ templates/about/index.html
@@ -12,7 +12,7 @@

  <h2>Fast</h2>

-<p>Mercurial's implementation and data structures are designed to be fast. You can generate diffs between revisions, or jump back in time within seconds. Therefore Mercurial is perfectly suitable for large projects such as OpenJDK (<a href="http://hg.openjdk.java.net/jdk7/jdk7">hg</a>) or NetBeans (<a href="http://hg.netbeans.org/">hg</a>).</p>
+<p>Mercurial's implementation and data structures are designed to be fast. You can generate diffs between revisions, or jump back in time within seconds. Therefore Mercurial is perfectly suitable for large projects such as Mozilla (<a href="https://hg.mozilla.org/">hg</a>).</p>

  <h2>Platform independent</h2>

