Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing Cygwin read mode in 4,2 #1

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

john-peterson
Copy link
Member

References

http://sourceware.org/bugzilla/show_bug.cgi?id=14597

Purpose

Removing the \r\n → \n conversion.

The goal is to have the same behavior in Cygwin as in Linux.

Discussion

Reasons to disable LF ↔ CRLF conversion by default

  • It's off by default in Linux and script writers will sometimes not consider Cygwin compatibility.
  • There are Windows editors that support LF newlines.
  • The conversion is probably less commonly desired than not desired.

Test case

Previous version

printf "\r\n"|sed -e's, , ,'|cat -A -
$

This version

printf "\r\n"|./sed -e's, , ,'|cat -A -
^M$

Additional notes

Conversions done by text read and write

  • rt \r\n\n
  • wt \n\r\n

t in rt/wt is equivalent to O_TEXT

Removing the \r\n → \n conversion.

The goal is to have the same behavior in Cygwin as in Linux.
MXEBot pushed a commit that referenced this pull request Jul 5, 2016
Previously sed would parse multibyte characters incorrectly in two scenarios:

1. Slash following an incomplete-yet-valid multibyte sequence (match_slash):
    $ LC_ALL=en_US.UTF-8 sed $'s/\316/X/'
    sed: -e expression #1, char 6: unterminated `s' command

2. Open/close brackets as part of a valid mutilbyte string inside a character
class (snarf_char_class). In the example below, '\203]' is a valid
multibyte character in SHIFT-JIS locale:

    $ LC_ALL=ja_JP.shiftjis sed $'/[\203]/]/p'
    sed: -e expression #1, char #5: Unmatched [ or [^

Both cases stem from mbcs.c:brlen() being non-intuitive:
It returned 1 for valid single-byte character, invalid multibyte-character,
and a for the last byte of a valid multibyte sequence - making it
non-trivial to use correctly.

This commit replaces brlen() with a simpler is_mb_char() function:
returns non-zero for multibyte sequences, zero for single/invalid sequences.

* sed/sed.h: (BRLEN, brlen): Remove delaration.
(IS_MB_CHAR,is_mb_char): Add macro and function declaration.
* sed/mbcs.c: (brlen): Remove function. (is_mb_char): New function.
* sed/compile.c: (snarf_char_class, match_slash): Use IS_MB_CHAR instead of
BRLEN; Adjust local variables accordingly.
* testsuite/mb-match-slash.sh: New test for scenario 1.
* testsuite/mb-charclass-non-utf8.sh: New test for scenario 2,
requires SHIFT-JIS locale.
* testsuite/Makefile.am: Add new tests
* testsuite/init.cfg: (require_ja_shiftjis_locale_): New function.
* NEWS: Mention bug fix.
MXEBot pushed a commit that referenced this pull request Mar 25, 2017
sed-4.4 and earlier rejected the following:
  $ sed 'y/1/a/ #f'
  sed: -e expression #1, char 8: extra characters after command

See https://bugs.gnu.org/22460 .

* sed/compile.c (compile_program): Handle comments, braces after 'y' command.
* NEWS: Mention it.
MXEBot pushed a commit that referenced this pull request Oct 20, 2018
w/W (and s///w) commands Without a filename would print a confusing error
message:
  $ sed w
  sed: couldn't open file : No such file or directory
While r/R commands with empty file name were a silent no-op.

With this change, sed programs with empty filename are rejected with a
clear error:

  $ sed 's/1/2/w'
  sed: -e expression #1, char 7: missing filename in r/R/w/W commands
  $ sed r
  sed: -e expression #1, char 1: missing filename in r/R/w/W commands

* NEWS: Mention change.
* sed/compile.c (get_openfile): Exit with an error message if filename
is missing. (compile_program): Same for 'r' command code.
* testsuite/missing-filename.sh: New test.
* testsuite/local.mk (T): Add new test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant