1. Tree equivalencies. Currently, /contend/ /content/ /resend/ /resent/
produces (?:conten[dt]|resen[dt]) but it is possible to produce
(?:cont|res)en[dt] if one can spot the common tail nodes (and walk back
the equivalent paths). Or, in the simplest case, be by me my => /[bm][ey]/.

To do this requires a certain amount of restructuring of the code.
Currently, the algorithm uses a two-phase approach. In the first
phase, the trie is traversed and reductions are performed. In the
second phase, the reduced trie is traversed and the pattern is
emitted.

Instead, the reduction and emission have to occur together in a
single pass. As a node is completed, it is replaced by its string
representation. This then allows child nodes to be compared for
equality with a simple 'eq'. Since there is only a single traversal,
the overall generation time might drop, even though the context
baggage required to delve through the tree will be more expensive
to carry along (a hash rather than a couple of scalars).

Actually, a simpler approach is to tack a secret sentinel
atom onto the end of every pattern, which gives the reduction
algorithm sufficient traction to create a perfect trie.

I'm rewriting the reduction code using this technique.
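
The idea of replacing a completed node by its string representation and
comparing branches with 'eq' can be sketched in a few lines. This is an
illustration only, not the module's reduction code: trie_insert and emit
are hypothetical names, the words are assumed to be plain alphanumerics,
and only sibling branches with identical tails are merged.

```perl
use strict;
use warnings;

# Insert a word into a hash-of-hashes trie.
# The empty-string key marks "a word ends here".
sub trie_insert {
    my ($trie, $word) = @_;
    my $node = $trie;
    $node = $node->{$_} ||= {} for split //, $word;
    $node->{''} = 1;
    return;
}

# Emit the pattern for a node in a single traversal. Each child is
# first reduced to its string form, so equal tails can be found with
# a plain hash lookup (that is, with 'eq').
sub emit {
    my ($node) = @_;
    my %by_tail;        # tail string => leading characters that share it
    my $optional = 0;   # true if a word may end at this node
    for my $ch (sort keys %$node) {
        if ($ch eq '') { $optional = 1; next }
        push @{ $by_tail{ emit($node->{$ch}) } }, $ch;
    }
    my @alts;
    for my $tail (sort keys %by_tail) {
        my @chars = @{ $by_tail{$tail} };
        my $head = @chars == 1
            ? quotemeta($chars[0])
            : '[' . join('', @chars) . ']';
        push @alts, $head . $tail;
    }
    return '' unless @alts;
    my $out = (@alts == 1 && !$optional)
        ? $alts[0]
        : '(?:' . join('|', @alts) . ')';
    $out .= '?' if $optional;
    return $out;
}

my %trie;
trie_insert(\%trie, $_) for qw(be by me my);
print emit(\%trie), "\n";   # [bm][ey]
```

For contend, content, resend and resent this sketch still emits two
alternatives, because the shared en[dt] tail hangs off different parents.
Spotting that tail is the part that needs the sentinel, or some
equivalent way of walking back the equivalent paths.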

2. Investigate how (?>foo) works. Can it be applied?
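
A small experiment shows what has to be investigated. Once an atomic
group (?>...) has matched, the engine never backtracks into it, so an
alternation in which one branch is a prefix of another can stop
matching. The sub names below are hypothetical, and the patterns are
hand-written rather than assembled output.

```perl
use strict;
use warnings;

# Plain group: after 'fo' leads to a dead end, the engine backtracks
# into the group and tries 'foo'.
sub plain_match  { return $_[0] =~ /\A(?:fo|foo)bar\z/ ? 1 : 0 }

# Atomic group: once 'fo' has matched, the group is never re-entered,
# so the longer alternative is never tried.
sub atomic_match { return $_[0] =~ /\A(?>fo|foo)bar\z/ ? 1 : 0 }

print plain_match('foobar'),  "\n";   # 1
print atomic_match('foobar'), "\n";   # 0
print atomic_match('fobar'),  "\n";   # 1
```

So an atomic group looks safe only where no branch is a prefix of
another branch in the same group.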

5. How can a tracked pattern be serialised? (Add freeze and thaw methods).

6. Store callbacks per tracked pattern.

12. utf-8... hmmmm...

14. Adding qr//'ed patterns. For example, consider
    $r->add ( qr/^abc/i )
        ->add( qr/^abd/ )
        ->add( qr/^ab e/x );
    this should admit abc abC aBc aBC abd abe as matches
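
As a reference point for the expected match set, the three patterns can
simply be joined into one alternation. This is a naive baseline, not an
assembly: naive_union is a hypothetical helper, and it relies on each
qr// object carrying its own flags when it is stringified.

```perl
use strict;
use warnings;

# Join compiled patterns into a single alternation. Each qr// object
# stringifies with its flags embedded, so /i and /x stay local to the
# branch they came from.
sub naive_union {
    my $alt = join '|', @_;
    return qr/$alt/;
}

my $re = naive_union(qr/^abc/i, qr/^abd/, qr/^ab e/x);
for my $s (qw(abc abC aBc aBC abd abe aBd)) {
    printf "%-3s %s\n", $s, ($s =~ $re ? 'match' : 'no match');
}
```

Only aBd is rejected, since /i applies to the first pattern alone. An
assembled pattern would have to keep that distinction while still
sharing the common ab prefix.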

16. Allow a fast, unsafe tracking mode, that can be used if a(?:bc)?
    can't happen. (Possibly carp if it does appear during traversal?)

17. Given a-\d+-\d+-\d+-\d+-b, produce a(?:-\d+){4}-b. Something
    along the lines of (.{4})(\1+) would let the regexp engine
    itself be brought to bear on the matter, which is a rather
    appealing idea. Consider

      while(/(?!\+)(\S{2,}?)(\1+)/g) { ... $1, $2 ... }

    as a starting point.
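
    The starting point above can be turned into a rough, purely
    textual sketch. The helper names rolled and reroll are
    hypothetical. Because it works on raw characters, it can split
    an escape or a character class in half, so a real version would
    presumably have to work on lexed tokens.

```perl
use strict;
use warnings;

# Build "(?:unit){n}" from the first copy of a repeated unit and the
# run of further copies that followed it.
sub rolled {
    my ($unit, $rest) = @_;
    my $count = 1 + length($rest) / length($unit);
    return "(?:$unit)" . '{' . $count . '}';
}

# Let the regexp engine find the shortest unit of two or more
# characters that is immediately repeated, then collapse the run.
sub reroll {
    my ($pattern) = @_;
    $pattern =~ s/(?!\+)(\S{2,}?)(\1+)/rolled($1, $2)/ge;
    return $pattern;
}

print reroll('a-\d+-\d+-\d+-\d+-b'), "\n";   # a(?:-\d+){4}-b
```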

19. The reduction code has become unbelievably baroque. Adding code
    to handle (sting,singing,sing) => s(?:(?:ing)?|t)ing was far
    too difficult. Adding more stuff just breaks existing behaviour.
    And fixing the ^abcd$ ... bug broke stuff all over again.
    Now that the corner cases are more clearly identified, a full
    rewrite of the reduction code is needed. This would also admit
    the possibility of implementing items 1 and 17.

20. Handle debug unrev with a separate bit

23. Japhy's http://www.perlmonks.org/index.pl?node_id=90876 list2range
    regexp

24. Lookahead assertions contain serious bugs (as shown by
    assembling powersets). Need to save more context during reduction,
    which in turn will simplify the preparation of the lookahead
    classes. See also 19.

26. _lex() swamps the overall run-time. It stems from the decision
    to use a single regexp to pull apart any pattern. A suite of
    simpler regexps to pick off parens, char classes, quantifiers
    and bare tokens may be faster. (This has been implemented as
    _fastlex(), but it's only marginally faster. Perhaps
    split-by-char and lex a la C?)

27. We don't, as yet, unroll_plus a paren e.g. (abc)+?

28. We don't reroll unrolled a a* to a+ in indented or tracked
    output

29. Use (*MARK n) in blead for tracked patterns, and use (*FAIL) for
    the unmatchable pattern.
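
    A hand-written sketch of both verbs follows, assuming a Perl
    (5.10 or later) in which the verb is spelled (*MARK:NAME) and
    the name of the last mark is left in $REGMARK after a
    successful match. The mark names p1 and p2 and the sub
    which_branch are made up for the example.

```perl
use strict;
use warnings;

our $REGMARK;   # set by the regexp engine after a successful match

# Each alternative ends in its own mark, so the branch taken can be
# read back without a capture group per branch.
my $tracked = qr/\A(?:foo(*MARK:p1)|ba[rz](*MARK:p2))\z/;

sub which_branch {
    my ($str) = @_;
    return $str =~ $tracked ? $REGMARK : undef;
}

# A pattern that can never match, whatever the input.
my $never = qr/(*FAIL)/;

print which_branch('baz'), "\n";                      # p2
print 'anything' =~ $never ? "match\n" : "no match\n";  # no match
```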