[PATCH 01 of 11 V2] tests: make subprocess handling reusable for different tests in test-stdio.py

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH 01 of 11 V2] tests: make subprocess handling reusable for different tests in test-stdio.py

Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594116820 -7200
#      Tue Jul 07 12:13:40 2020 +0200
# Node ID 2668345a92975b843cab2e766b8fa59181003772
# Parent  fa270dcbdb551766b3f244cd9e99b663e62e7696
# EXP-Topic stdio
tests: make subprocess handling reusable for different tests in test-stdio.py

diff --git a/tests/test-stdio.py b/tests/test-stdio.py
--- a/tests/test-stdio.py
+++ b/tests/test-stdio.py
@@ -14,7 +14,7 @@
 from mercurial import pycompat
 
 
-BUFFERING_CHILD_SCRIPT = r'''
+TEST_BUFFERING_CHILD_SCRIPT = r'''
 import os
 
 from mercurial import dispatch
@@ -81,24 +81,27 @@
 
 
 class TestStdio(unittest.TestCase):
-    def _test(self, stream, rwpair_generator, expected_output, python_args=[]):
+    def _test(
+        self,
+        child_script,
+        stream,
+        rwpair_generator,
+        check_output,
+        python_args=[],
+    ):
         assert stream in ('stdout', 'stderr')
         with rwpair_generator() as (stream_receiver, child_stream), open(
             os.devnull, 'rb'
         ) as child_stdin:
             proc = subprocess.Popen(
-                [sys.executable]
-                + python_args
-                + ['-c', BUFFERING_CHILD_SCRIPT.format(stream=stream)],
+                [sys.executable] + python_args + ['-c', child_script],
                 stdin=child_stdin,
                 stdout=child_stream if stream == 'stdout' else None,
                 stderr=child_stream if stream == 'stderr' else None,
             )
             try:
                 os.close(child_stream)
-                self.assertEqual(
-                    _readall(stream_receiver, 1024), expected_output
-                )
+                check_output(stream_receiver)
             except:  # re-raises
                 proc.terminate()
                 raise
@@ -106,17 +109,31 @@
                 retcode = proc.wait()
             self.assertEqual(retcode, 0)
 
+    def _test_buffering(
+        self, stream, rwpair_generator, expected_output, python_args=[]
+    ):
+        def check_output(stream_receiver):
+            self.assertEqual(_readall(stream_receiver, 1024), expected_output)
+
+        self._test(
+            TEST_BUFFERING_CHILD_SCRIPT.format(stream=stream),
+            stream,
+            rwpair_generator,
+            check_output,
+            python_args,
+        )
+
     def test_buffering_stdout_pipes(self):
-        self._test('stdout', _pipes, FULLY_BUFFERED)
+        self._test_buffering('stdout', _pipes, FULLY_BUFFERED)
 
     def test_buffering_stdout_ptys(self):
-        self._test('stdout', _ptys, LINE_BUFFERED)
+        self._test_buffering('stdout', _ptys, LINE_BUFFERED)
 
     def test_buffering_stdout_pipes_unbuffered(self):
-        self._test('stdout', _pipes, UNBUFFERED, python_args=['-u'])
+        self._test_buffering('stdout', _pipes, UNBUFFERED, python_args=['-u'])
 
     def test_buffering_stdout_ptys_unbuffered(self):
-        self._test('stdout', _ptys, UNBUFFERED, python_args=['-u'])
+        self._test_buffering('stdout', _ptys, UNBUFFERED, python_args=['-u'])
 
     if not pycompat.ispy3 and not pycompat.iswindows:
         # On Python 2 on non-Windows, we manually open stdout in line-buffered

_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 02 of 11 V2] procutil: ensure that all stdio file objects are flushed at interpreter exit

Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594346556 -7200
#      Fri Jul 10 04:02:36 2020 +0200
# Node ID a0ddc1349dde0f5849d196236c3fa31d49934511
# Parent  2668345a92975b843cab2e766b8fa59181003772
# EXP-Topic stdio
procutil: ensure that all stdio file objects are flushed at interpreter exit

On Python 2 on Unix, we open a second file object referring to the stdout file
descriptor. If this file object gets garbage collected before sys.stdout (which
is very likely), the file descriptor gets closed and sys.stdout has no chance
to flush buffered data. Therefore, we flush them explicitly before the file
objects get garbage collected and the file descriptor gets closed.

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -50,6 +50,27 @@
         return False
 
 
+def reopen_stream(orig, mode, buffering):
+    import atexit
+
+    assert not pycompat.ispy3
+    # On Python 2, there's no way to create a file such that the file
+    # descriptor doesn't get closed if the file object gets garbage collected.
+    # Therefore, when reopening an already open stream, the file descriptor
+    # will be closed twice. The error message for that will be swallowed during
+    # interpreter shutdown.
+    new = os.fdopen(orig.fileno(), mode, buffering)
+
+    @atexit.register
+    def flush_streams():
+        # Ensure that one stream gets flushed before the other closes the file
+        # descriptor.
+        new.flush()
+        orig.flush()
+
+    return new
+
+
 class LineBufferedWrapper(object):
     def __init__(self, orig):
         self.orig = orig
@@ -104,7 +125,7 @@
         # Therefore we have a wrapper that implements line buffering.
         stdout = make_line_buffered(stdout)
     else:
-        stdout = os.fdopen(stdout.fileno(), 'wb', 1)
+        stdout = reopen_stream(stdout, 'wb', 1)
 
 
 findexe = platform.findexe
diff --git a/tests/test-stdio.py b/tests/test-stdio.py
--- a/tests/test-stdio.py
+++ b/tests/test-stdio.py
@@ -30,6 +30,16 @@
 LINE_BUFFERED = b'[written aaa]aaabbb\n[written bbb\\n]'
 FULLY_BUFFERED = b'[written aaa][written bbb\\n]aaabbb\n'
 
+TEST_FLUSH_STDIO_CHILD_SCRIPT = r'''
+import sys
+
+from mercurial import dispatch
+from mercurial.utils import procutil
+
+dispatch.initstdio()
+sys.{stream}.write('test')
+'''
+
 
 @contextlib.contextmanager
 def _closing(fds):
@@ -143,6 +153,29 @@
             test_buffering_stdout_ptys_unbuffered
         )
 
+    def _test_flush_stdio(self, stream, rwpair_generator):
+        def check_output(stream_receiver):
+            self.assertEqual(_readall(stream_receiver, 1024), b'test')
+
+        self._test(
+            TEST_FLUSH_STDIO_CHILD_SCRIPT.format(stream=stream),
+            stream,
+            rwpair_generator,
+            check_output,
+        )
+
+    def test_flush_stdout_pipes(self):
+        self._test_flush_stdio('stdout', _pipes)
+
+    def test_flush_stdout_ptys(self):
+        self._test_flush_stdio('stdout', _ptys)
+
+    def test_flush_stderr_pipes(self):
+        self._test_flush_stdio('stderr', _pipes)
+
+    def test_flush_stderr_ptys(self):
+        self._test_flush_stdio('stderr', _ptys)
+
 
 if __name__ == '__main__':
     import silenttestrunner

_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 03 of 11 V2] procutil: explain better why line buffering is not possible

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594367738 -7200
#      Fri Jul 10 09:55:38 2020 +0200
# Node ID ba4ec481948de77e36c8267f1acf5b904efba761
# Parent  a0ddc1349dde0f5849d196236c3fa31d49934511
# EXP-Topic stdio
procutil: explain better why line buffering is not possible

The sentence “On Python 3, buffered binary streams can't be set line-buffered.”
was imprecise, as all streams are just Python classes and we can implement our
own (which we did).

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -120,9 +120,10 @@
 # buffering.
 if isatty(stdout):
     if pycompat.ispy3 or pycompat.iswindows:
-        # On Python 3, buffered binary streams can't be set line-buffered.
-        # On Python 2, Windows doesn't support line buffering.
-        # Therefore we have a wrapper that implements line buffering.
+        # Python 3 implements its own I/O streams.
+        # The standard library doesn't offer line-buffered binary streams.
+        # Python 2 uses the I/O streams provided by the C library.
+        # The Windows C runtime library doesn't support line buffering.
         stdout = make_line_buffered(stdout)
     else:
         stdout = reopen_stream(stdout, 'wb', 1)
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 04 of 11 V2] procutil: split if condition

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594367976 -7200
#      Fri Jul 10 09:59:36 2020 +0200
# Node ID d7f793516419e60b23c5a667c57972a6d1ac0f0f
# Parent  ba4ec481948de77e36c8267f1acf5b904efba761
# EXP-Topic stdio
procutil: split if condition

This prepares the code for subsequent changes where we need to differentiate
between the two cases.

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -119,9 +119,11 @@
 # destined stdout with a pipe destined stdout (e.g. pager), we want line
 # buffering.
 if isatty(stdout):
-    if pycompat.ispy3 or pycompat.iswindows:
+    if pycompat.ispy3:
         # Python 3 implements its own I/O streams.
         # The standard library doesn't offer line-buffered binary streams.
+        stdout = make_line_buffered(stdout)
+    elif pycompat.iswindows:
         # Python 2 uses the I/O streams provided by the C library.
         # The Windows C runtime library doesn't support line buffering.
         stdout = make_line_buffered(stdout)

_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 05 of 11 V2] procutil: use mercurial.windows.winstdout only on Python 2 and TTYs

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594368185 -7200
#      Fri Jul 10 10:03:05 2020 +0200
# Node ID d439863e9340a845d2f1986ec45f40f38704e932
# Parent  d7f793516419e60b23c5a667c57972a6d1ac0f0f
# EXP-Topic stdio
procutil: use mercurial.windows.winstdout only on Python 2 and TTYs

Python 3 already works around the bug. The workaround is only needed when
writing to consoles. If stdout is a console, sys.stdout.isatty() is true.

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -112,9 +112,6 @@
     stdout = sys.stdout
     stderr = sys.stderr
 
-if pycompat.iswindows:
-    stdout = platform.winstdout(stdout)
-
 # glibc determines buffering on first write to stdout - if we replace a TTY
 # destined stdout with a pipe destined stdout (e.g. pager), we want line
 # buffering.
@@ -124,6 +121,8 @@
         # The standard library doesn't offer line-buffered binary streams.
         stdout = make_line_buffered(stdout)
     elif pycompat.iswindows:
+        # Work around size limit when writing to console.
+        stdout = platform.winstdout(stdout)
         # Python 2 uses the I/O streams provided by the C library.
         # The Windows C runtime library doesn't support line buffering.
         stdout = make_line_buffered(stdout)

_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 06 of 11 V2] procutil: move comment closer to relevant code

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594368776 -7200
#      Fri Jul 10 10:12:56 2020 +0200
# Node ID 472cbd166fe075b8786d2965a44b4e5d1c26bdf5
# Parent  d439863e9340a845d2f1986ec45f40f38704e932
# EXP-Topic stdio
procutil: move comment closer to relevant code

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -112,9 +112,6 @@
     stdout = sys.stdout
     stderr = sys.stderr
 
-# glibc determines buffering on first write to stdout - if we replace a TTY
-# destined stdout with a pipe destined stdout (e.g. pager), we want line
-# buffering.
 if isatty(stdout):
     if pycompat.ispy3:
         # Python 3 implements its own I/O streams.
@@ -127,6 +124,9 @@
         # The Windows C runtime library doesn't support line buffering.
         stdout = make_line_buffered(stdout)
     else:
+        # glibc determines buffering on first write to stdout - if we
+        # replace a TTY destined stdout with a pipe destined stdout (e.g.
+        # pager), we want line buffering.
         stdout = reopen_stream(stdout, 'wb', 1)
 
 

_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 07 of 11 V2] procutil: distribute code for stdout

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594368724 -7200
#      Fri Jul 10 10:12:04 2020 +0200
# Node ID 2151604d856a8cd9c2049e785a995714f49be115
# Parent  472cbd166fe075b8786d2965a44b4e5d1c26bdf5
# EXP-Topic stdio
procutil: distribute code for stdout

It makes sense to have the distinction between Python 2 and 3 at the top level,
as we have to fight a different kind of battle on each: On Python 3, we get
consistent behavior on all platforms, but need to create correctly-behaving
binary streams. On Python 2, we have to account for platform differences.

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -102,32 +102,31 @@
 
 
 if pycompat.ispy3:
+    # Python 3 implements its own I/O streams.
     # TODO: .buffer might not exist if std streams were replaced; we'll need
     # a silly wrapper to make a bytes stream backed by a unicode one.
     stdin = sys.stdin.buffer
     stdout = sys.stdout.buffer
     stderr = sys.stderr.buffer
+    if isatty(stdout):
+        # The standard library doesn't offer line-buffered binary streams.
+        stdout = make_line_buffered(stdout)
 else:
+    # Python 2 uses the I/O streams provided by the C library.
     stdin = sys.stdin
     stdout = sys.stdout
     stderr = sys.stderr
-
-if isatty(stdout):
-    if pycompat.ispy3:
-        # Python 3 implements its own I/O streams.
-        # The standard library doesn't offer line-buffered binary streams.
-        stdout = make_line_buffered(stdout)
-    elif pycompat.iswindows:
-        # Work around size limit when writing to console.
-        stdout = platform.winstdout(stdout)
-        # Python 2 uses the I/O streams provided by the C library.
-        # The Windows C runtime library doesn't support line buffering.
-        stdout = make_line_buffered(stdout)
-    else:
-        # glibc determines buffering on first write to stdout - if we
-        # replace a TTY destined stdout with a pipe destined stdout (e.g.
-        # pager), we want line buffering.
-        stdout = reopen_stream(stdout, 'wb', 1)
+    if isatty(stdout):
+        if pycompat.iswindows:
+            # Work around size limit when writing to console.
+            stdout = platform.winstdout(stdout)
+            # The Windows C runtime library doesn't support line buffering.
+            stdout = make_line_buffered(stdout)
+        else:
+            # glibc determines buffering on first write to stdout - if we
+            # replace a TTY destined stdout with a pipe destined stdout (e.g.
+            # pager), we want line buffering.
+            stdout = reopen_stream(stdout, 'wb', 1)
 
 
 findexe = platform.findexe

_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 08 of 11 V2] procutil: move assignments

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594446424 -7200
#      Sat Jul 11 07:47:04 2020 +0200
# Node ID c904a1769e30652813eb1df619e8ee7d8feb6102
# Parent  2151604d856a8cd9c2049e785a995714f49be115
# EXP-Topic stdio
procutil: move assignments

This should probably be part of the previous patch, but folding it results in a
less useful word diff, so I decided to keep it separate for review.

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -107,15 +107,14 @@
     # a silly wrapper to make a bytes stream backed by a unicode one.
     stdin = sys.stdin.buffer
     stdout = sys.stdout.buffer
-    stderr = sys.stderr.buffer
     if isatty(stdout):
         # The standard library doesn't offer line-buffered binary streams.
         stdout = make_line_buffered(stdout)
+    stderr = sys.stderr.buffer
 else:
     # Python 2 uses the I/O streams provided by the C library.
     stdin = sys.stdin
     stdout = sys.stdout
-    stderr = sys.stderr
     if isatty(stdout):
         if pycompat.iswindows:
             # Work around size limit when writing to console.
@@ -127,6 +126,7 @@
             # replace a TTY destined stdout with a pipe destined stdout (e.g.
             # pager), we want line buffering.
             stdout = reopen_stream(stdout, 'wb', 1)
+    stderr = sys.stderr
 
 
 findexe = platform.findexe

_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 09 of 11 V2] procutil: ensure that procutil.std{out,err}.write() writes all bytes

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594376878 -7200
#      Fri Jul 10 12:27:58 2020 +0200
# Node ID 4adabe764af5c4104da59140a04c1102592e60c1
# Parent  c904a1769e30652813eb1df619e8ee7d8feb6102
# EXP-Topic stdio
procutil: ensure that procutil.std{out,err}.write() writes all bytes

Python 3 offers different kind of streams and it’s not guaranteed for all of
them that calling write() writes all bytes.

When Python is started in unbuffered mode, sys.std{out,err}.buffer are
instances of io.FileIO, whose write() can write less bytes for
platform-specific reasons (e.g. Linux has a 0x7ffff000 bytes maximum and could
write less if interrupted by a signal; when writing to Windows consoles, it’s
limited to 32767 bytes to avoid the "not enough space" error). This can lead to
silent loss of data, both when using sys.std{out,err}.buffer (which may in fact
not be a buffered stream) and when using the text streams sys.std{out,err}
(I’ve created a CPython bug report for that:
https://bugs.python.org/issue41221).

Python may fix the problem at some point. For now, we implement our own wrapper
for procutil.std{out,err} that calls the raw stream’s write() method until all
bytes have been written. We don’t use sys.std{out,err} for larger writes, so I
think it’s not worth the effort to patch them.

diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
--- a/mercurial/utils/procutil.py
+++ b/mercurial/utils/procutil.py
@@ -101,16 +101,49 @@
     return LineBufferedWrapper(stream)
 
 
+class WriteAllWrapper(object):
+    def __init__(self, orig):
+        self.orig = orig
+
+    def __getattr__(self, attr):
+        return getattr(self.orig, attr)
+
+    def write(self, s):
+        write1 = self.orig.write
+        m = memoryview(s)
+        total_to_write = len(s)
+        total_written = 0
+        while total_written < total_to_write:
+            total_written += write1(m[total_written:])
+        return total_written
+
+
+io.IOBase.register(WriteAllWrapper)
+
+
+def make_write_all(stream):
+    assert pycompat.ispy3
+    if isinstance(stream, WriteAllWrapper):
+        return stream
+    if isinstance(stream, io.BufferedIOBase):
+        # The io.BufferedIOBase.write() contract guarantees that all data is
+        # written.
+        return stream
+    # In general, the write() method of streams is free to write only part of
+    # the data.
+    return WriteAllWrapper(stream)
+
+
 if pycompat.ispy3:
     # Python 3 implements its own I/O streams.
     # TODO: .buffer might not exist if std streams were replaced; we'll need
     # a silly wrapper to make a bytes stream backed by a unicode one.
     stdin = sys.stdin.buffer
-    stdout = sys.stdout.buffer
+    stdout = make_write_all(sys.stdout.buffer)
     if isatty(stdout):
         # The standard library doesn't offer line-buffered binary streams.
         stdout = make_line_buffered(stdout)
-    stderr = sys.stderr.buffer
+    stderr = make_write_all(sys.stderr.buffer)
 else:
     # Python 2 uses the I/O streams provided by the C library.
     stdin = sys.stdin
diff --git a/tests/test-stdio.py b/tests/test-stdio.py
--- a/tests/test-stdio.py
+++ b/tests/test-stdio.py
@@ -7,6 +7,7 @@
 import contextlib
 import errno
 import os
+import signal
 import subprocess
 import sys
 import unittest
@@ -41,6 +42,19 @@
 '''
 
 
+TEST_LARGE_WRITE_CHILD_SCRIPT = r'''
+import signal
+import sys
+
+from mercurial import dispatch
+from mercurial.utils import procutil
+
+signal.signal(signal.SIGINT, lambda *x: None)
+dispatch.initstdio()
+procutil.{stream}.write(b'x' * 1048576)
+'''
+
+
 @contextlib.contextmanager
 def _closing(fds):
     try:
@@ -73,8 +87,8 @@
         yield rwpair
 
 
-def _readall(fd, buffer_size):
-    buf = []
+def _readall(fd, buffer_size, initial_buf=None):
+    buf = initial_buf or []
     while True:
         try:
             s = os.read(fd, buffer_size)
@@ -111,7 +125,7 @@
             )
             try:
                 os.close(child_stream)
-                check_output(stream_receiver)
+                check_output(stream_receiver, proc)
             except:  # re-raises
                 proc.terminate()
                 raise
@@ -122,7 +136,7 @@
     def _test_buffering(
         self, stream, rwpair_generator, expected_output, python_args=[]
     ):
-        def check_output(stream_receiver):
+        def check_output(stream_receiver, proc):
             self.assertEqual(_readall(stream_receiver, 1024), expected_output)
 
         self._test(
@@ -154,7 +168,7 @@
         )
 
     def _test_flush_stdio(self, stream, rwpair_generator):
-        def check_output(stream_receiver):
+        def check_output(stream_receiver, proc):
             self.assertEqual(_readall(stream_receiver, 1024), b'test')
 
         self._test(
@@ -176,6 +190,61 @@
     def test_flush_stderr_ptys(self):
         self._test_flush_stdio('stderr', _ptys)
 
+    def _test_large_write(self, stream, rwpair_generator, python_args=[]):
+        if not pycompat.ispy3 and pycompat.isdarwin:
+            # Python 2 doesn't always retry on EINTR, but the libc might retry.
+            # So far, it was observed only on macOS that EINTR is raised at the
+            # Python level. As Python 2 support will be dropped soon-ish, we
+            # won't attempt to fix it.
+            raise unittest.SkipTest("raises EINTR on macOS")
+
+        def check_output(stream_receiver, proc):
+            if not pycompat.iswindows:
+                # On Unix, we can provoke a partial write() by interrupting it
+                # by a signal handler as soon as a bit of data was written.
+                # We test that write() is called until all data is written.
+                buf = [os.read(stream_receiver, 1)]
+                proc.send_signal(signal.SIGINT)
+            else:
+                # On Windows, there doesn't seem to be a way to cause partial
+                # writes.
+                buf = []
+            self.assertEqual(
+                _readall(stream_receiver, 131072, buf), b'x' * 1048576
+            )
+
+        self._test(
+            TEST_LARGE_WRITE_CHILD_SCRIPT.format(stream=stream),
+            stream,
+            rwpair_generator,
+            check_output,
+            python_args,
+        )
+
+    def test_large_write_stdout_pipes(self):
+        self._test_large_write('stdout', _pipes)
+
+    def test_large_write_stdout_ptys(self):
+        self._test_large_write('stdout', _ptys)
+
+    def test_large_write_stdout_pipes_unbuffered(self):
+        self._test_large_write('stdout', _pipes, python_args=['-u'])
+
+    def test_large_write_stdout_ptys_unbuffered(self):
+        self._test_large_write('stdout', _ptys, python_args=['-u'])
+
+    def test_large_write_stderr_pipes(self):
+        self._test_large_write('stderr', _pipes)
+
+    def test_large_write_stderr_ptys(self):
+        self._test_large_write('stderr', _ptys)
+
+    def test_large_write_stderr_pipes_unbuffered(self):
+        self._test_large_write('stderr', _pipes, python_args=['-u'])
+
+    def test_large_write_stderr_ptys_unbuffered(self):
+        self._test_large_write('stderr', _ptys, python_args=['-u'])
+
 
 if __name__ == '__main__':
     import silenttestrunner
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 10 of 11 V2] tests: add tests for when stdout or stderr is connected to `os.devnull`

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594291962 -7200
#      Thu Jul 09 12:52:42 2020 +0200
# Node ID 533e609683424b17e4dfd04b126d8ee9d081730b
# Parent  4adabe764af5c4104da59140a04c1102592e60c1
# EXP-Topic stdio
tests: add tests for when stdout or stderr is connected to `os.devnull`

The original motivation was that creating PTYs on Windows is not possible, but
`NUL` is recognized as a TTY, so we can have at least some test coverage for
the TTY case. I think it doesn’t hurt to run the test cases on all systems.

diff --git a/tests/test-stdio.py b/tests/test-stdio.py
--- a/tests/test-stdio.py
+++ b/tests/test-stdio.py
@@ -68,6 +68,13 @@
 
 
 @contextlib.contextmanager
+def _devnull():
+    devnull = os.open(os.devnull, os.O_WRONLY)
+    with _closing([devnull]):
+        yield (None, devnull)
+
+
+@contextlib.contextmanager
 def _pipes():
     rwpair = os.pipe()
     with _closing(rwpair):
@@ -125,7 +132,8 @@
             )
             try:
                 os.close(child_stream)
-                check_output(stream_receiver, proc)
+                if stream_receiver is not None:
+                    check_output(stream_receiver, proc)
             except:  # re-raises
                 proc.terminate()
                 raise
@@ -147,12 +155,18 @@
             python_args,
         )
 
+    def test_buffering_stdout_devnull(self):
+        self._test_buffering('stdout', _devnull, None)
+
     def test_buffering_stdout_pipes(self):
         self._test_buffering('stdout', _pipes, FULLY_BUFFERED)
 
     def test_buffering_stdout_ptys(self):
         self._test_buffering('stdout', _ptys, LINE_BUFFERED)
 
+    def test_buffering_stdout_devnull_unbuffered(self):
+        self._test_buffering('stdout', _devnull, None, python_args=['-u'])
+
     def test_buffering_stdout_pipes_unbuffered(self):
         self._test_buffering('stdout', _pipes, UNBUFFERED, python_args=['-u'])
 
@@ -221,24 +235,36 @@
             python_args,
         )
 
+    def test_large_write_stdout_devnull(self):
+        self._test_large_write('stdout', _devnull)
+
     def test_large_write_stdout_pipes(self):
         self._test_large_write('stdout', _pipes)
 
     def test_large_write_stdout_ptys(self):
         self._test_large_write('stdout', _ptys)
 
+    def test_large_write_stdout_devnull_unbuffered(self):
+        self._test_large_write('stdout', _devnull, python_args=['-u'])
+
     def test_large_write_stdout_pipes_unbuffered(self):
         self._test_large_write('stdout', _pipes, python_args=['-u'])
 
     def test_large_write_stdout_ptys_unbuffered(self):
         self._test_large_write('stdout', _ptys, python_args=['-u'])
 
+    def test_large_write_stderr_devnull(self):
+        self._test_large_write('stderr', _devnull)
+
     def test_large_write_stderr_pipes(self):
         self._test_large_write('stderr', _pipes)
 
     def test_large_write_stderr_ptys(self):
         self._test_large_write('stderr', _ptys)
 
+    def test_large_write_stderr_devnull_unbuffered(self):
+        self._test_large_write('stderr', _devnull, python_args=['-u'])
+
     def test_large_write_stderr_pipes_unbuffered(self):
         self._test_large_write('stderr', _pipes, python_args=['-u'])
 
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

[PATCH 11 of 11 V2] tests: check that procutil.std{out,err}.write() returns correct result

Manuel Jacob
In reply to this post by Manuel Jacob
# HG changeset patch
# User Manuel Jacob <[hidden email]>
# Date 1594291924 -7200
#      Thu Jul 09 12:52:04 2020 +0200
# Node ID d8ead3e59d9d5ac0d96cbd52baa906d174b88c08
# Parent  533e609683424b17e4dfd04b126d8ee9d081730b
# EXP-Topic stdio
tests: check that procutil.std{out,err}.write() returns correct result

On Windows, we currently don’t fully test the case when the stream is connected
to a TTY, but we test the child process side by connecting them to NUL, which
is recognized as a TTY by Python. To make the large write test a bit more
useful besides checking that it doesn’t crash, we can check that the write()
method returns the correct result.

diff --git a/tests/test-stdio.py b/tests/test-stdio.py
--- a/tests/test-stdio.py
+++ b/tests/test-stdio.py
@@ -10,6 +10,7 @@
 import signal
 import subprocess
 import sys
+import tempfile
 import unittest
 
 from mercurial import pycompat
@@ -51,7 +52,9 @@
 
 signal.signal(signal.SIGINT, lambda *x: None)
 dispatch.initstdio()
-procutil.{stream}.write(b'x' * 1048576)
+write_result = procutil.{stream}.write(b'x' * 1048576)
+with open({write_result_fn}, 'w') as write_result_f:
+    write_result_f.write(str(write_result))
 '''
 
 
@@ -119,6 +122,7 @@
         rwpair_generator,
         check_output,
         python_args=[],
+        post_child_check=None,
     ):
         assert stream in ('stdout', 'stderr')
         with rwpair_generator() as (stream_receiver, child_stream), open(
@@ -140,6 +144,8 @@
             finally:
                 retcode = proc.wait()
             self.assertEqual(retcode, 0)
+            if post_child_check is not None:
+                post_child_check()
 
     def _test_buffering(
         self, stream, rwpair_generator, expected_output, python_args=[]
@@ -227,13 +233,39 @@
                 _readall(stream_receiver, 131072, buf), b'x' * 1048576
             )
 
-        self._test(
-            TEST_LARGE_WRITE_CHILD_SCRIPT.format(stream=stream),
-            stream,
-            rwpair_generator,
-            check_output,
-            python_args,
-        )
+        def post_child_check():
+            with open(write_result_fn, 'r') as write_result_f:
+                write_result_str = write_result_f.read()
+            if pycompat.ispy3:
+                # On Python 3, we test that the correct number of bytes is
+                # claimed to have been written.
+                expected_write_result_str = '1048576'
+            else:
+                # On Python 2, we only check that the large write does not
+                # crash.
+                expected_write_result_str = 'None'
+            self.assertEqual(write_result_str, expected_write_result_str)
+
+        try:
+            # tempfile.mktemp() is unsafe in general, as a malicious process
+            # could create the file before we do. But in tests, we're running
+            # in a controlled environment.
+            write_result_fn = tempfile.mktemp()
+            self._test(
+                TEST_LARGE_WRITE_CHILD_SCRIPT.format(
+                    stream=stream, write_result_fn=repr(write_result_fn)
+                ),
+                stream,
+                rwpair_generator,
+                check_output,
+                python_args,
+                post_child_check=post_child_check,
+            )
+        finally:
+            try:
+                os.unlink(write_result_fn)
+            except OSError:
+                pass
 
     def test_large_write_stdout_devnull(self):
         self._test_large_write('stdout', _devnull)
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 07 of 11 V2] procutil: distribute code for stdout

Manuel Jacob
In reply to this post by Manuel Jacob
This is not as much of an improvement as it was before the code for
stderr was removed. But I think it’s still a little improvement and a
later patch depends on it, so I decided to send it anyway.

On 2020-07-13 00:41, Manuel Jacob wrote:

> # HG changeset patch
> # User Manuel Jacob <[hidden email]>
> # Date 1594368724 -7200
> #      Fri Jul 10 10:12:04 2020 +0200
> # Node ID 2151604d856a8cd9c2049e785a995714f49be115
> # Parent  472cbd166fe075b8786d2965a44b4e5d1c26bdf5
> # EXP-Topic stdio
> procutil: distribute code for stdout
>
> It makes sense to have the distinction between Python 2 and 3 at the
> top level,
> as we have to fight a different kind of battle on each: On Python 3, we
> get
> consistent behavior on all platforms, but need to create
> correctly-behaving
> binary streams. On Python 2, we have to account for platform
> differences.
>
> diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
> --- a/mercurial/utils/procutil.py
> +++ b/mercurial/utils/procutil.py
> @@ -102,32 +102,31 @@
>
>
>  if pycompat.ispy3:
> +    # Python 3 implements its own I/O streams.
>      # TODO: .buffer might not exist if std streams were replaced;
> we'll need
>      # a silly wrapper to make a bytes stream backed by a unicode one.
>      stdin = sys.stdin.buffer
>      stdout = sys.stdout.buffer
>      stderr = sys.stderr.buffer
> +    if isatty(stdout):
> +        # The standard library doesn't offer line-buffered binary
> streams.
> +        stdout = make_line_buffered(stdout)
>  else:
> +    # Python 2 uses the I/O streams provided by the C library.
>      stdin = sys.stdin
>      stdout = sys.stdout
>      stderr = sys.stderr
> -
> -if isatty(stdout):
> -    if pycompat.ispy3:
> -        # Python 3 implements its own I/O streams.
> -        # The standard library doesn't offer line-buffered binary
> streams.
> -        stdout = make_line_buffered(stdout)
> -    elif pycompat.iswindows:
> -        # Work around size limit when writing to console.
> -        stdout = platform.winstdout(stdout)
> -        # Python 2 uses the I/O streams provided by the C library.
> -        # The Windows C runtime library doesn't support line
> buffering.
> -        stdout = make_line_buffered(stdout)
> -    else:
> -        # glibc determines buffering on first write to stdout - if we
> -        # replace a TTY destined stdout with a pipe destined stdout
> (e.g.
> -        # pager), we want line buffering.
> -        stdout = reopen_stream(stdout, 'wb', 1)
> +    if isatty(stdout):
> +        if pycompat.iswindows:
> +            # Work around size limit when writing to console.
> +            stdout = platform.winstdout(stdout)
> +            # The Windows C runtime library doesn't support line
> buffering.
> +            stdout = make_line_buffered(stdout)
> +        else:
> +            # glibc determines buffering on first write to stdout - if
> we
> +            # replace a TTY destined stdout with a pipe destined
> stdout (e.g.
> +            # pager), we want line buffering.
> +            stdout = reopen_stream(stdout, 'wb', 1)
>
>
>  findexe = platform.findexe
>
> _______________________________________________
> Mercurial-devel mailing list
> [hidden email]
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 02 of 11 V2] procutil: ensure that all stdio file objects are flushed at interpreter exit

Yuya Nishihara
In reply to this post by Manuel Jacob
On Mon, 13 Jul 2020 00:41:27 +0200, Manuel Jacob wrote:

> # HG changeset patch
> # User Manuel Jacob <[hidden email]>
> # Date 1594346556 -7200
> #      Fri Jul 10 04:02:36 2020 +0200
> # Node ID a0ddc1349dde0f5849d196236c3fa31d49934511
> # Parent  2668345a92975b843cab2e766b8fa59181003772
> # EXP-Topic stdio
> procutil: ensure that all stdio file objects are flushed at interpreter exit
>
> On Python 2 on Unix, we open a second file object referring to the stdout file
> descriptor. If this file object gets garbage collected before sys.stdout (which
> is very likely), the file descriptor gets closed and sys.stdout has no chance
> to flush buffered data. Therefore, we flush them explicitly before the file
> objects get garbage collected and the file descriptor gets closed.
>
> diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
> --- a/mercurial/utils/procutil.py
> +++ b/mercurial/utils/procutil.py
> @@ -50,6 +50,27 @@
>          return False
>  
>  
> +def reopen_stream(orig, mode, buffering):
> +    import atexit
> +
> +    assert not pycompat.ispy3
> +    # On Python 2, there's no way to create a file such that the file
> +    # descriptor doesn't get closed if the file object gets garbage collected.
> +    # Therefore, when reopening an already open stream, the file descriptor
> +    # will be closed twice. The error message for that will be swallowed during
> +    # interpreter shutdown.
> +    new = os.fdopen(orig.fileno(), mode, buffering)
> +
> +    @atexit.register
> +    def flush_streams():
> +        # Ensure that one stream gets flushed before the other closes the file
> +        # descriptor.
> +        new.flush()
> +        orig.flush()
> +
> +    return new

Didn't we agree that the use of atexit wouldn't buy and we'll instead add
py3-only solution later?

As I said, atexit will increase the complexity of the control flow. All data
should be flushed inside dispatch.run().
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 01 of 11 V2] tests: make subprocess handling reusable for different tests in test-stdio.py

Yuya Nishihara
In reply to this post by Manuel Jacob
On Mon, 13 Jul 2020 00:41:26 +0200, Manuel Jacob wrote:
> # HG changeset patch
> # User Manuel Jacob <[hidden email]>
> # Date 1594116820 -7200
> #      Tue Jul 07 12:13:40 2020 +0200
> # Node ID 2668345a92975b843cab2e766b8fa59181003772
> # Parent  fa270dcbdb551766b3f244cd9e99b663e62e7696
> # EXP-Topic stdio
> tests: make subprocess handling reusable for different tests in test-stdio.py

Queued 1, 3, 4, 5, 6, thanks.
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 02 of 11 V2] procutil: ensure that all stdio file objects are flushed at interpreter exit

Manuel Jacob
In reply to this post by Yuya Nishihara
On 2020-07-13 13:27, Yuya Nishihara wrote:

> On Mon, 13 Jul 2020 00:41:27 +0200, Manuel Jacob wrote:
>> # HG changeset patch
>> # User Manuel Jacob <[hidden email]>
>> # Date 1594346556 -7200
>> #      Fri Jul 10 04:02:36 2020 +0200
>> # Node ID a0ddc1349dde0f5849d196236c3fa31d49934511
>> # Parent  2668345a92975b843cab2e766b8fa59181003772
>> # EXP-Topic stdio
>> procutil: ensure that all stdio file objects are flushed at
>> interpreter exit
>>
>> On Python 2 on Unix, we open a second file object referring to the
>> stdout file
>> descriptor. If this file object gets garbage collected before
>> sys.stdout (which
>> is very likely), the file descriptor gets closed and sys.stdout has no
>> chance
>> to flush buffered data. Therefore, we flush them explicitly before the
>> file
>> objects get garbage collected and the file descriptor gets closed.
>>
>> diff --git a/mercurial/utils/procutil.py b/mercurial/utils/procutil.py
>> --- a/mercurial/utils/procutil.py
>> +++ b/mercurial/utils/procutil.py
>> @@ -50,6 +50,27 @@
>>          return False
>>
>>
>> +def reopen_stream(orig, mode, buffering):
>> +    import atexit
>> +
>> +    assert not pycompat.ispy3
>> +    # On Python 2, there's no way to create a file such that the file
>> +    # descriptor doesn't get closed if the file object gets garbage
>> collected.
>> +    # Therefore, when reopening an already open stream, the file
>> descriptor
>> +    # will be closed twice. The error message for that will be
>> swallowed during
>> +    # interpreter shutdown.
>> +    new = os.fdopen(orig.fileno(), mode, buffering)
>> +
>> +    @atexit.register
>> +    def flush_streams():
>> +        # Ensure that one stream gets flushed before the other closes
>> the file
>> +        # descriptor.
>> +        new.flush()
>> +        orig.flush()
>> +
>> +    return new
>
> Didn't we agree that the use of atexit wouldn't buy and we'll instead
> add
> py3-only solution later?
>
> As I said, atexit will increase the complexity of the control flow. All
> data
> should be flushed inside dispatch.run().

We talked past each other, then. What I meant is that for stderr it’s
not worth it because 1) we have other reasons to back it out, 2) stderr
may be used for tracebacks etc. even after the atexit hook. For stdout
it’s not as important because stdout is mostly written to through the ui
object, but still there are some contrib scripts etc. using it directly.
I don’t see any problem using atexit for this particular use case, but
since I didn’t invest further time trying to find a case where we need
it for stdout (and the stderr problem was solved otherwise), I’d be fine
dropping the patch. On Python 3, we don’t need to do anything. What do
you think?
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 02 of 11 V2] procutil: ensure that all stdio file objects are flushed at interpreter exit

Yuya Nishihara
On Mon, 13 Jul 2020 17:14:31 +0200, Manuel Jacob wrote:

> > Didn't we agree that the use of atexit wouldn't buy and we'll instead
> > add
> > py3-only solution later?
> >
> > As I said, atexit will increase the complexity of the control flow. All
> > data
> > should be flushed inside dispatch.run().
>
> We talked past each other, then. What I meant is that for stderr it’s
> not worth it because 1) we have other reasons to back it out, 2) stderr
> may be used for tracebacks etc. even after the atexit hook. For stdout
> it’s not as important because stdout is mostly written to through the ui
> object, but still there are some contrib scripts etc. using it directly.

I see. I meant the use of atexit isn't nice at all. Sorry for confusion.

> I don’t see any problem using atexit for this particular use case, but
> since I didn’t invest further time trying to find a case where we need
> it for stdout (and the stderr problem was solved otherwise), I’d be fine
> dropping the patch. On Python 3, we don’t need to do anything. What do
> you think?

I prefer dropping this patch.

Historically we've replaced atexit() with custom ui.atexit() function for
better control of errors and execution flow, see c13ff31818b0. I don't
remember the exact problem, but doing I/O in atexit() would be undesired.
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel