[PATCH 1 of 2 v2] match: adding support for matching files inside a directory

Martin von Zweigbergk via Mercurial-devel
# HG changeset patch
# User Rodrigo Damazio Bovendorp <[hidden email]>
# Date 1487029169 28800
#      Mon Feb 13 15:39:29 2017 -0800
# Node ID 94264a6e6672c917d42518f7ae9322445868d067
# Parent  72f25e17af9d6a206ea374c30f229ae9513f3f23
match: adding support for matching files inside a directory

This adds a new "rootfilesin" matcher type which matches files inside a
directory, but not any subdirectories (so it matches non-recursively).
This has the "root" prefix per foozy's plan for other matchers (rootglob,
rootpath, cwdre, etc.).

diff -r 72f25e17af9d -r 94264a6e6672 mercurial/help/patterns.txt
--- a/mercurial/help/patterns.txt Mon Feb 13 02:31:56 2017 -0800
+++ b/mercurial/help/patterns.txt Mon Feb 13 15:39:29 2017 -0800
@@ -13,7 +13,10 @@
 
 To use a plain path name without any pattern matching, start it with
 ``path:``. These path names must completely match starting at the
-current repository root.
+current repository root, and when the path points to a directory, it is matched
+recursively. To match all files in a directory non-recursively (not including
+any files in subdirectories), ``rootfilesin:`` can be used, specifying an
+absolute path (relative to the repository root).
 
 To use an extended glob, start a name with ``glob:``. Globs are rooted
 at the current directory; a glob such as ``*.c`` will only match files
@@ -39,12 +42,15 @@
 All patterns, except for ``glob:`` specified in command line (not for
 ``-I`` or ``-X`` options), can match also against directories: files
 under matched directories are treated as matched.
+For ``-I`` and ``-X`` options, ``glob:`` will match directories recursively.
 
 Plain examples::
 
-  path:foo/bar   a name bar in a directory named foo in the root
-                 of the repository
-  path:path:name a file or directory named "path:name"
+  path:foo/bar        a name bar in a directory named foo in the root
+                      of the repository
+  path:path:name      a file or directory named "path:name"
+  rootfilesin:foo/bar the files in a directory called foo/bar, but not any files
+                      in its subdirectories and not a file bar in directory foo
 
 Glob examples::
 
@@ -52,6 +58,8 @@
   *.c            any name ending in ".c" in the current directory
   **.c           any name ending in ".c" in any subdirectory of the
                  current directory including itself.
+  foo/*          any file in directory foo plus all its subdirectories,
+                 recursively
   foo/*.c        any name ending in ".c" in the directory foo
   foo/**.c       any name ending in ".c" in any subdirectory of foo
                  including itself.
diff -r 72f25e17af9d -r 94264a6e6672 mercurial/match.py
--- a/mercurial/match.py Mon Feb 13 02:31:56 2017 -0800
+++ b/mercurial/match.py Mon Feb 13 15:39:29 2017 -0800
@@ -104,7 +104,10 @@
         a pattern is one of:
         'glob:<glob>' - a glob relative to cwd
         're:<regexp>' - a regular expression
-        'path:<path>' - a path relative to repository root
+        'path:<path>' - a path relative to repository root, which is matched
+                        recursively
+        'rootfilesin:<path>' - a path relative to repository root, which is
+                        matched non-recursively (will not match subdirectories)
         'relglob:<glob>' - an unrooted glob (*.c matches C files in all dirs)
         'relpath:<path>' - a path relative to cwd
         'relre:<regexp>' - a regexp that needn't match the start of a name
@@ -153,7 +156,7 @@
         elif patterns:
             kindpats = self._normalize(patterns, default, root, cwd, auditor)
             if not _kindpatsalwaysmatch(kindpats):
-                self._files = _roots(kindpats)
+                self._files = _explicitfiles(kindpats)
                 self._anypats = self._anypats or _anypats(kindpats)
                 self.patternspat, pm = _buildmatch(ctx, kindpats, '$',
                                                    listsubrepos, root)
@@ -286,7 +289,7 @@
         for kind, pat in [_patsplit(p, default) for p in patterns]:
             if kind in ('glob', 'relpath'):
                 pat = pathutil.canonpath(root, cwd, pat, auditor)
-            elif kind in ('relglob', 'path'):
+            elif kind in ('relglob', 'path', 'rootfilesin'):
                 pat = util.normpath(pat)
             elif kind in ('listfile', 'listfile0'):
                 try:
@@ -447,7 +450,8 @@
     if ':' in pattern:
         kind, pat = pattern.split(':', 1)
         if kind in ('re', 'glob', 'path', 'relglob', 'relpath', 'relre',
-                    'listfile', 'listfile0', 'set', 'include', 'subinclude'):
+                    'listfile', 'listfile0', 'set', 'include', 'subinclude',
+                    'rootfilesin'):
             return kind, pat
     return default, pattern
 
@@ -540,6 +544,14 @@
         if pat == '.':
             return ''
         return '^' + util.re.escape(pat) + '(?:/|$)'
+    if kind == 'rootfilesin':
+        if pat == '.':
+            escaped = ''
+        else:
+            # Pattern is a directory name.
+            escaped = util.re.escape(pat) + '/'
+        # Anything after the pattern must be a non-directory.
+        return '^' + escaped + '[^/]+$'
     if kind == 'relglob':
         return '(?:|.*/)' + _globre(pat) + globsuffix
     if kind == 'relpath':
@@ -614,6 +626,8 @@
 
     >>> _roots([('glob', 'g/*', ''), ('glob', 'g', ''), ('glob', 'g*', '')])
     ['g', 'g', '.']
+    >>> _roots([('rootfilesin', 'g', ''), ('rootfilesin', '', '')])
+    ['g', '.']
     >>> _roots([('relpath', 'r', ''), ('path', 'p/p', ''), ('path', '', '')])
     ['r', 'p/p', '.']
     >>> _roots([('relglob', 'rg*', ''), ('re', 're/', ''), ('relre', 'rr', '')])
@@ -628,15 +642,28 @@
                     break
                 root.append(p)
             r.append('/'.join(root) or '.')
-        elif kind in ('relpath', 'path'):
+        elif kind in ('relpath', 'path', 'rootfilesin'):
             r.append(pat or '.')
         else: # relglob, re, relre
             r.append('.')
     return r
 
+def _explicitfiles(kindpats):
+    '''Returns the potential explicit filenames from the patterns.
+
+    >>> _explicitfiles([('path', 'foo/bar', '')])
+    ['foo/bar']
+    >>> _explicitfiles([('rootfilesin', 'foo/bar', '')])
+    []
+    '''
+    # Keep only the pattern kinds where one can specify filenames (vs only
+    # directory names).
+    filable = [kp for kp in kindpats if kp[0] not in ('rootfilesin')]
+    return _roots(filable)
+
 def _anypats(kindpats):
     for kind, pat, source in kindpats:
-        if kind in ('glob', 're', 'relglob', 'relre', 'set'):
+        if kind in ('glob', 're', 'relglob', 'relre', 'set', 'rootfilesin'):
             return True
 
 _commentre = None
diff -r 72f25e17af9d -r 94264a6e6672 tests/test-walk.t
--- a/tests/test-walk.t Mon Feb 13 02:31:56 2017 -0800
+++ b/tests/test-walk.t Mon Feb 13 15:39:29 2017 -0800
@@ -112,6 +112,74 @@
   f  beans/navy      ../beans/navy
   f  beans/pinto     ../beans/pinto
   f  beans/turtle    ../beans/turtle
+
+  $ hg debugwalk 'rootfilesin:'
+  f  fennel      ../fennel
+  f  fenugreek   ../fenugreek
+  f  fiddlehead  ../fiddlehead
+  $ hg debugwalk -I 'rootfilesin:'
+  f  fennel      ../fennel
+  f  fenugreek   ../fenugreek
+  f  fiddlehead  ../fiddlehead
+  $ hg debugwalk 'rootfilesin:.'
+  f  fennel      ../fennel
+  f  fenugreek   ../fenugreek
+  f  fiddlehead  ../fiddlehead
+  $ hg debugwalk -I 'rootfilesin:.'
+  f  fennel      ../fennel
+  f  fenugreek   ../fenugreek
+  f  fiddlehead  ../fiddlehead
+  $ hg debugwalk -X 'rootfilesin:'
+  f  beans/black                     ../beans/black
+  f  beans/borlotti                  ../beans/borlotti
+  f  beans/kidney                    ../beans/kidney
+  f  beans/navy                      ../beans/navy
+  f  beans/pinto                     ../beans/pinto
+  f  beans/turtle                    ../beans/turtle
+  f  mammals/Procyonidae/cacomistle  Procyonidae/cacomistle
+  f  mammals/Procyonidae/coatimundi  Procyonidae/coatimundi
+  f  mammals/Procyonidae/raccoon     Procyonidae/raccoon
+  f  mammals/skunk                   skunk
+  $ hg debugwalk 'rootfilesin:fennel'
+  $ hg debugwalk -I 'rootfilesin:fennel'
+  $ hg debugwalk 'rootfilesin:skunk'
+  $ hg debugwalk -I 'rootfilesin:skunk'
+  $ hg debugwalk 'rootfilesin:beans'
+  f  beans/black     ../beans/black
+  f  beans/borlotti  ../beans/borlotti
+  f  beans/kidney    ../beans/kidney
+  f  beans/navy      ../beans/navy
+  f  beans/pinto     ../beans/pinto
+  f  beans/turtle    ../beans/turtle
+  $ hg debugwalk -I 'rootfilesin:beans'
+  f  beans/black     ../beans/black
+  f  beans/borlotti  ../beans/borlotti
+  f  beans/kidney    ../beans/kidney
+  f  beans/navy      ../beans/navy
+  f  beans/pinto     ../beans/pinto
+  f  beans/turtle    ../beans/turtle
+  $ hg debugwalk 'rootfilesin:mammals'
+  f  mammals/skunk  skunk
+  $ hg debugwalk -I 'rootfilesin:mammals'
+  f  mammals/skunk  skunk
+  $ hg debugwalk 'rootfilesin:mammals/'
+  f  mammals/skunk  skunk
+  $ hg debugwalk -I 'rootfilesin:mammals/'
+  f  mammals/skunk  skunk
+  $ hg debugwalk -X 'rootfilesin:mammals'
+  f  beans/black                     ../beans/black
+  f  beans/borlotti                  ../beans/borlotti
+  f  beans/kidney                    ../beans/kidney
+  f  beans/navy                      ../beans/navy
+  f  beans/pinto                     ../beans/pinto
+  f  beans/turtle                    ../beans/turtle
+  f  fennel                          ../fennel
+  f  fenugreek                       ../fenugreek
+  f  fiddlehead                      ../fiddlehead
+  f  mammals/Procyonidae/cacomistle  Procyonidae/cacomistle
+  f  mammals/Procyonidae/coatimundi  Procyonidae/coatimundi
+  f  mammals/Procyonidae/raccoon     Procyonidae/raccoon
+
   $ hg debugwalk .
   f  mammals/Procyonidae/cacomistle  Procyonidae/cacomistle
   f  mammals/Procyonidae/coatimundi  Procyonidae/coatimundi
_______________________________________________
Mercurial-devel mailing list
[hidden email]
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
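
To illustrate the difference between ``path:`` and ``rootfilesin:`` introduced by the patch above, here is a minimal sketch that rebuilds the two regular expressions outside Mercurial. It assumes the standard library's re.escape in place of util.re.escape; the names path_regex, rootfilesin_regex, select and FILES are made up for this example, and the file names are taken from tests/test-walk.t.

```python
import re

def path_regex(pat):
    # 'path:' matches the name itself or anything below it (recursive).
    if pat == '.':
        return ''
    return '^' + re.escape(pat) + '(?:/|$)'

def rootfilesin_regex(pat):
    # 'rootfilesin:' matches only direct children of the directory:
    # after the directory prefix, no further '/' is allowed.
    if pat == '.':
        escaped = ''
    else:
        escaped = re.escape(pat) + '/'
    return '^' + escaped + '[^/]+$'

def select(regex, files):
    # Hypothetical helper: keep the files the regex matches from the start.
    return [f for f in files if re.match(regex, f)]

FILES = [
    'fennel',
    'beans/black',
    'mammals/skunk',
    'mammals/Procyonidae/raccoon',
]

if __name__ == '__main__':
    print(select(rootfilesin_regex('mammals'), FILES))  # ['mammals/skunk']
    print(select(path_regex('mammals'), FILES))
    print(select(rootfilesin_regex('.'), FILES))        # ['fennel']
    print(select(rootfilesin_regex('fennel'), FILES))   # [] (fennel is a file)
```

The last call mirrors the empty output of "hg debugwalk 'rootfilesin:fennel'" in the test: the pattern names a directory, so a plain file of that name never matches.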

[PATCH 2 of 2 v2] match: making visitdir() deal with non-recursive entries

Martin von Zweigbergk via Mercurial-devel
# HG changeset patch
# User Rodrigo Damazio Bovendorp <[hidden email]>
# Date 1487034194 28800
#      Mon Feb 13 17:03:14 2017 -0800
# Node ID e90de197586d0749e64cef752613e6fe41d1c8e3
# Parent  94264a6e6672c917d42518f7ae9322445868d067
match: making visitdir() deal with non-recursive entries

Primarily as an optimization to avoid recursing into directories that will
never have a match inside, this classifies each matcher pattern's root as
recursive or non-recursive (erring on the side of keeping it recursive,
which may lead to wasteful directory or manifest walks that yield no matches).

I measured the performance of "rootfilesin" in two repos:
- The Firefox repo with tree manifests, with
  "hg files -r . -I rootfilesin:browser".
  The browser directory contains about 3K files across 249 subdirectories.
- A specific Google-internal directory which contains 75K files across 19K
  subdirectories, with "hg files -r . -I rootfilesin:REDACTED".

I tested with both cold and warm disk caches. Cold cache was produced by
running "sync; echo 3 > /proc/sys/vm/drop_caches". Warm cache was produced
by re-running the same command a few times.

These were the results:

               Cold cache           Warm cache
             Before   After      Before   After
firefox      0m5.1s   0m2.18s   0m0.22s  0m0.14s
google3 dir  2m3.9s   0m1.57s   0m8.12s  0m0.16s

Certain extensions, notably narrowhg, can depend on this for correctness
(not trying to recurse into directories for which it has no information).

diff -r 94264a6e6672 -r e90de197586d mercurial/match.py
--- a/mercurial/match.py Mon Feb 13 15:39:29 2017 -0800
+++ b/mercurial/match.py Mon Feb 13 17:03:14 2017 -0800
@@ -125,9 +125,12 @@
         self._always = False
         self._pathrestricted = bool(include or exclude or patterns)
         self._warn = warn
+
+        # roots are directories which are recursively included/excluded.
         self._includeroots = set()
+        self._excluderoots = set()
+        # dirs are directories which are non-recursively included.
         self._includedirs = set(['.'])
-        self._excluderoots = set()
 
         if badfn is not None:
             self.bad = badfn
@@ -137,14 +140,20 @@
             kindpats = self._normalize(include, 'glob', root, cwd, auditor)
             self.includepat, im = _buildmatch(ctx, kindpats, '(?:/|$)',
                                               listsubrepos, root)
-            self._includeroots.update(_roots(kindpats))
-            self._includedirs.update(util.dirs(self._includeroots))
+            roots, dirs = _rootsanddirs(kindpats)
+            self._includeroots.update(roots)
+            self._includedirs.update(dirs)
             matchfns.append(im)
         if exclude:
             kindpats = self._normalize(exclude, 'glob', root, cwd, auditor)
             self.excludepat, em = _buildmatch(ctx, kindpats, '(?:/|$)',
                                               listsubrepos, root)
             if not _anypats(kindpats):
+                # Only consider recursive excludes as such - if a non-recursive
+                # exclude is used, we must still recurse into the excluded
+                # directory, at least to find subdirectories. In such a case,
+                # the regex still won't match the non-recursively-excluded
+                # files.
                 self._excluderoots.update(_roots(kindpats))
             matchfns.append(lambda f: not em(f))
         if exact:
@@ -241,7 +250,7 @@
             return 'all'
         if dir in self._excluderoots:
             return False
-        if (self._includeroots and
+        if ((self._includeroots or self._includedirs != set(['.'])) and
             '.' not in self._includeroots and
             dir not in self._includeroots and
             dir not in self._includedirs and
@@ -422,7 +431,9 @@
         # m.exact(file) must be based off of the actual user input, otherwise
         # inexact case matches are treated as exact, and not noted without -v.
         if self._files:
-            self._fileroots = set(_roots(self._kp))
+            roots, dirs = _rootsanddirs(self._kp)
+            self._fileroots = set(roots)
+            self._fileroots.update(dirs)
 
     def _normalize(self, patterns, default, root, cwd, auditor):
         self._kp = super(icasefsmatcher, self)._normalize(patterns, default,
@@ -621,19 +632,16 @@
                     raise error.Abort(_("invalid pattern (%s): %s") % (k, p))
         raise error.Abort(_("invalid pattern"))
 
-def _roots(kindpats):
-    '''return roots and exact explicitly listed files from patterns
-
-    >>> _roots([('glob', 'g/*', ''), ('glob', 'g', ''), ('glob', 'g*', '')])
-    ['g', 'g', '.']
-    >>> _roots([('rootfilesin', 'g', ''), ('rootfilesin', '', '')])
-    ['g', '.']
-    >>> _roots([('relpath', 'r', ''), ('path', 'p/p', ''), ('path', '', '')])
-    ['r', 'p/p', '.']
-    >>> _roots([('relglob', 'rg*', ''), ('re', 're/', ''), ('relre', 'rr', '')])
-    ['.', '.', '.']
+def _patternrootsanddirs(kindpats):
+    '''Returns roots and directories corresponding to each pattern.
+
+    This calculates the roots and directories exactly matching the patterns and
+    returns a tuple of (roots, dirs) for each. It does not return other
+    directories which may also need to be considered, like the parent
+    directories.
     '''
     r = []
+    d = []
     for kind, pat, source in kindpats:
         if kind == 'glob': # find the non-glob prefix
             root = []
@@ -642,11 +650,48 @@
                     break
                 root.append(p)
             r.append('/'.join(root) or '.')
-        elif kind in ('relpath', 'path', 'rootfilesin'):
+        elif kind in ('relpath', 'path'):
             r.append(pat or '.')
+        elif kind in ('rootfilesin'):
+            d.append(pat or '.')
         else: # relglob, re, relre
             r.append('.')
-    return r
+    return r, d
+
+def _roots(kindpats):
+    '''Returns root directories to match recursively from the given patterns.'''
+    roots, dirs = _patternrootsanddirs(kindpats)
+    return roots
+
+def _rootsanddirs(kindpats):
+    '''Returns roots and exact directories from patterns.
+
+    roots are directories to match recursively, whereas exact directories should
+    be matched non-recursively. The returned (roots, dirs) tuple will also
+    include directories that need to be implicitly considered as either, such as
+    parent directories.
+
+    >>> _rootsanddirs(\
+        [('glob', 'g/h/*', ''), ('glob', 'g/h', ''), ('glob', 'g*', '')])
+    (['g/h', 'g/h', '.'], ['g'])
+    >>> _rootsanddirs(\
+        [('rootfilesin', 'g/h', ''), ('rootfilesin', '', '')])
+    ([], ['g/h', '.', 'g'])
+    >>> _rootsanddirs(\
+        [('relpath', 'r', ''), ('path', 'p/p', ''), ('path', '', '')])
+    (['r', 'p/p', '.'], ['p'])
+    >>> _rootsanddirs(\
+        [('relglob', 'rg*', ''), ('re', 're/', ''), ('relre', 'rr', '')])
+    (['.', '.', '.'], [])
+    '''
+    r, d = _patternrootsanddirs(kindpats)
+
+    # Append the parents as non-recursive/exact directories, since they must be
+    # scanned to get to either the roots or the other exact directories.
+    d.extend(util.dirs(d))
+    d.extend(util.dirs(r))
+
+    return r, d
 
 def _explicitfiles(kindpats):
     '''Returns the potential explicit filenames from the patterns.
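
The roots/dirs classification in this patch can be sketched standalone as follows. This is an approximation, not the code that was queued: parent_dirs is a hypothetical stand-in for util.dirs, roots_and_dirs is a made-up name, and the glob-character check is an assumption because the diff context does not show that condition. The two doctest inputs from the patch are used as examples.

```python
def parent_dirs(paths):
    # Hypothetical stand-in for util.dirs: every ancestor directory of
    # each path, without duplicates, excluding the path itself.
    seen = []
    for path in paths:
        pos = path.rfind('/')
        while pos != -1:
            parent = path[:pos]
            if parent not in seen:
                seen.append(parent)
            pos = path.rfind('/', 0, pos)
    return seen

def roots_and_dirs(kindpats):
    roots = []  # matched recursively
    dirs = []   # matched non-recursively
    for kind, pat, source in kindpats:
        if kind == 'glob':
            # Assumed check: stop at the first component with a glob char.
            prefix = []
            for part in pat.split('/'):
                if any(c in part for c in '[{*?'):
                    break
                prefix.append(part)
            roots.append('/'.join(prefix) or '.')
        elif kind in ('relpath', 'path'):
            roots.append(pat or '.')
        elif kind in ('rootfilesin',):
            dirs.append(pat or '.')
        else:  # relglob, re, relre
            roots.append('.')
    # Parents must be visited (non-recursively) to reach roots and dirs.
    dirs.extend(parent_dirs(dirs))
    dirs.extend(parent_dirs(roots))
    return roots, dirs

if __name__ == '__main__':
    print(roots_and_dirs([('rootfilesin', 'g/h', ''), ('rootfilesin', '', '')]))
    print(roots_and_dirs([('glob', 'g/h/*', ''), ('glob', 'g/h', ''),
                          ('glob', 'g*', '')]))
```

For these two inputs the sketch produces the same tuples as the doctests in the patch; other inputs may differ from Mercurial's behaviour where util.dirs orders or deduplicates differently.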

Re: [PATCH 1 of 2 v2] match: adding support for matching files inside a directory

Martin von Zweigbergk via Mercurial-devel
Foozy, how does this version of the series look to you?

Yuya, since it is in Google's interest to get this in, I'm reluctant
to queue it myself. Would you be able to do that (if it looks good to
you, of course)? Thanks.



Re: [PATCH 1 of 2 v2] match: adding support for matching files inside a directory

Yuya Nishihara
On Mon, 13 Feb 2017 17:05:23 -0800, Rodrigo Damazio Bovendorp via Mercurial-devel wrote:
> # HG changeset patch
> # User Rodrigo Damazio Bovendorp <[hidden email]>
> # Date 1487029169 28800
> #      Mon Feb 13 15:39:29 2017 -0800
> # Node ID 94264a6e6672c917d42518f7ae9322445868d067
> # Parent  72f25e17af9d6a206ea374c30f229ae9513f3f23
> match: adding support for matching files inside a directory

Looks good per foozy's comments on V1, queued. Thanks for the hard work on
consistent pattern naming.

> +def _explicitfiles(kindpats):
> +    '''Returns the potential explicit filenames from the patterns.
> +
> +    >>> _explicitfiles([('path', 'foo/bar', '')])
> +    ['foo/bar']
> +    >>> _explicitfiles([('rootfilesin', 'foo/bar', '')])
> +    []
> +    '''
> +    # Keep only the pattern kinds where one can specify filenames (vs only
> +    # directory names).
> +    filable = [kp for kp in kindpats if kp[0] not in ('rootfilesin')]
                                                        ^^^^^^^^^^^^^^^

Fixed this as "kp[0] not in ('rootfilesin',)".
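
For readers unfamiliar with the pitfall being fixed here: parentheses alone do not make a tuple in Python, so ('rootfilesin') is just a string and "in" performs a substring test. A small sketch follows; the kind 'root' is invented purely to show a false positive and is not a real Mercurial pattern kind.

```python
# Without the trailing comma this is a plain str, not a tuple.
KINDS_AS_STR = ('rootfilesin')
KINDS_AS_TUPLE = ('rootfilesin',)

def excluded_buggy(kind):
    # Substring test: true for any fragment of 'rootfilesin'.
    return kind in KINDS_AS_STR

def excluded_fixed(kind):
    # Membership test: true only for the exact kind.
    return kind in KINDS_AS_TUPLE

if __name__ == '__main__':
    for kind in ('rootfilesin', 'path', 'root', ''):
        print(repr(kind), excluded_buggy(kind), excluded_fixed(kind))
```

With the kinds that exist in this patch the two forms happen to give the same answer, which is why the doctests passed; the tuple form is the one that stays correct if a kind such as the hypothetical 'root' is ever added.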

Re: [PATCH 2 of 2 v2] match: making visitdir() deal with non-recursive entries

Yuya Nishihara
On Mon, 13 Feb 2017 17:05:24 -0800, Rodrigo Damazio Bovendorp via Mercurial-devel wrote:
> # HG changeset patch
> # User Rodrigo Damazio Bovendorp <[hidden email]>
> # Date 1487034194 28800
> #      Mon Feb 13 17:03:14 2017 -0800
> # Node ID e90de197586d0749e64cef752613e6fe41d1c8e3
> # Parent  94264a6e6672c917d42518f7ae9322445868d067
> match: making visitdir() deal with non-recursive entries

> @@ -241,7 +250,7 @@
>              return 'all'
>          if dir in self._excluderoots:
>              return False
> -        if (self._includeroots and
> +        if ((self._includeroots or self._includedirs != set(['.'])) and
>              '.' not in self._includeroots and
>              dir not in self._includeroots and
>              dir not in self._includedirs and

Maybe we'll need to distinguish the explicitly-set 'rootfilesin:.' from
the default value. Since visitdir() is for optimization, this shouldn't be
a blocker, so queued.
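
The ambiguity can be seen in a small sketch of just the first clause of the new visitdir() condition. The names DEFAULT_INCLUDEDIRS and pruning_enabled are hypothetical, and the remaining clauses of the real condition are deliberately left out because the hunk above does not show them in full.

```python
# Default value of matcher._includedirs set in __init__ by this patch.
DEFAULT_INCLUDEDIRS = {'.'}

def pruning_enabled(includeroots, includedirs):
    # First clause of the patched condition: only consider skipping a
    # directory when some include root or non-default include dir exists.
    return bool(includeroots) or includedirs != DEFAULT_INCLUDEDIRS

def includedirs_for(rootfilesin_dirs):
    # The default set is updated with the dirs from the -I patterns.
    dirs = set(DEFAULT_INCLUDEDIRS)
    dirs.update(rootfilesin_dirs)
    return dirs

if __name__ == '__main__':
    # -I rootfilesin:beans adds 'beans', so pruning is considered.
    print(pruning_enabled(set(), includedirs_for(['beans'])))
    # -I rootfilesin:. adds only '.', indistinguishable from the default.
    print(pruning_enabled(set(), includedirs_for(['.'])))
```

Because an explicit 'rootfilesin:.' leaves the set equal to its default, this clause is false and no directories are skipped; results stay correct, only the optimization is lost.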

> @@ -642,11 +650,48 @@
>                      break
>                  root.append(p)
>              r.append('/'.join(root) or '.')
> -        elif kind in ('relpath', 'path', 'rootfilesin'):
> +        elif kind in ('relpath', 'path'):
>              r.append(pat or '.')
> +        elif kind in ('rootfilesin'):
                        ^^^^^^^^^^^^^^^

Fixed this as well.

Re: [PATCH 2 of 2 v2] match: making visitdir() deal with non-recursive entries

Martin von Zweigbergk via Mercurial-devel


On Feb 17, 2017 10:17 PM, "Yuya Nishihara" <[hidden email]> wrote:
> On Mon, 13 Feb 2017 17:05:24 -0800, Rodrigo Damazio Bovendorp via Mercurial-devel wrote:
> > # HG changeset patch
> > # User Rodrigo Damazio Bovendorp <[hidden email]>
> > # Date 1487034194 28800
> > #      Mon Feb 13 17:03:14 2017 -0800
> > # Node ID e90de197586d0749e64cef752613e6fe41d1c8e3
> > # Parent  94264a6e6672c917d42518f7ae9322445868d067
> > match: making visitdir() deal with non-recursive entries
>
> > @@ -241,7 +250,7 @@
> >              return 'all'
> >          if dir in self._excluderoots:
> >              return False
> > -        if (self._includeroots and
> > +        if ((self._includeroots or self._includedirs != set(['.'])) and
> >              '.' not in self._includeroots and
> >              dir not in self._includeroots and
> >              dir not in self._includedirs and
>
> Maybe we'll need to distinguish the explicitly-set 'rootfilesin:.' from
> the default value. Since visitdir() is for optimization, this shouldn't be
> a blocker, so queued.

Good point, rootfilesin:. doesn't currently get optimized, it seems.

Unrelated to this patch: the optimization only applies to patterns given with -I, not to patterns specified outside of it. I remember working on fixing that two years ago, but I don't remember how far I got. It seems like it should be possible.

It would also be nice if the dirstate walk could take advantage of the visitdir logic.


> > @@ -642,11 +650,48 @@
> >                      break
> >                  root.append(p)
> >              r.append('/'.join(root) or '.')
> > -        elif kind in ('relpath', 'path', 'rootfilesin'):
> > +        elif kind in ('relpath', 'path'):
> >              r.append(pat or '.')
> > +        elif kind in ('rootfilesin'):
>                          ^^^^^^^^^^^^^^^
>
> Fixed this as well.