Ticket #3250: mc-3250-viewer-rewrite-v1.patch

File mc-3250-viewer-rewrite-v1.patch, 64.1 KB (added by egmont, 10 years ago)

Reimplementation, v1

1diff --git a/AUTHORS b/AUTHORS
2index bb85c83..60ef7f7 100644
3--- a/AUTHORS
4+++ b/AUTHORS
5@@ -64,6 +64,7 @@ Egmont Koblinger <egmont@gmail.com>
6         Support of extended mouse clicks beyond 223 column
7         Support of bracketed paste mode of xterm
8                 (http://invisible-island.net/xterm/ctlseqs/ctlseqs.html#Bracketed%20Paste%20Mode)
9+        Rewritten viewer
10 
11 Erwin van Eijk <wabbit@corner.iaf.nl>
12 
13diff --git a/src/viewer/Makefile.am b/src/viewer/Makefile.am
14index 53bc7a4..0602084 100644
15--- a/src/viewer/Makefile.am
16+++ b/src/viewer/Makefile.am
17@@ -3,6 +3,7 @@ noinst_LTLIBRARIES = libmcviewer.la
18 
19 libmcviewer_la_SOURCES = \
20        actions_cmd.c \
21+       ascii.c \
22        coord_cache.c \
23        datasource.c \
24        dialogs.c \
25@@ -16,7 +17,6 @@ libmcviewer_la_SOURCES = \
26        mcviewer.h \
27        move.c \
28        nroff.c \
29-       plain.c \
30        search.c
31 
32 AM_CPPFLAGS = -I$(top_srcdir) $(GLIB_CFLAGS) $(PCRE_CPPFLAGS)
33diff --git a/src/viewer/actions_cmd.c b/src/viewer/actions_cmd.c
34index c4d52b7..44c70cd 100644
35--- a/src/viewer/actions_cmd.c
36+++ b/src/viewer/actions_cmd.c
37@@ -512,6 +512,8 @@ mcview_execute_cmd (mcview_t * view, unsigned long command)
38         break;
39     case CK_Bookmark:
40         view->dpy_start = view->marks[view->marker];
41+        view->dpy_paragraph_skip_lines = 0;  // TODO: remember this value in the marker???
42+        view->dpy_wrap_dirty = TRUE;
43         view->dirty++;
44         break;
45 #ifdef HAVE_CHARSET
46@@ -594,6 +596,7 @@ mcview_adjust_size (WDialog * h)
47     widget_set_size (WIDGET (view), 0, 0, LINES - 1, COLS);
48     widget_set_size (WIDGET (b), LINES - 1, 0, 1, COLS);
49 
50+    view->dpy_wrap_dirty = TRUE;
51     mcview_compute_areas (view);
52     mcview_update_bytes_per_line (view);
53 }
54diff --git a/src/viewer/ascii.c b/src/viewer/ascii.c
55new file mode 100644
56index 0000000..ea83ab1
57--- /dev/null
58+++ b/src/viewer/ascii.c
59@@ -0,0 +1,899 @@
60+/*
61+   Internal file viewer for the Midnight Commander
62+   Function for plain view
63+
64+   Copyright (C) 1994-2014
65+   Free Software Foundation, Inc.
66+
67+   Written by:
68+   Miguel de Icaza, 1994, 1995, 1998
69+   Janne Kukonlehto, 1994, 1995
70+   Jakub Jelinek, 1995
71+   Joseph M. Hinkle, 1996
72+   Norbert Warmuth, 1997
73+   Pavel Machek, 1998
74+   Roland Illig <roland.illig@gmx.de>, 2004, 2005
75+   Slava Zanko <slavazanko@google.com>, 2009
76+   Andrew Borodin <aborodin@vmail.ru>, 2009-2014
77+   Ilia Maslakov <il.smind@gmail.com>, 2009
78+   Rewritten almost from scratch by:
79+   Egmont Koblinger <egmont@gmail.com>, 2014
80+
81+   This file is part of the Midnight Commander.
82+
83+   The Midnight Commander is free software: you can redistribute it
84+   and/or modify it under the terms of the GNU General Public License as
85+   published by the Free Software Foundation, either version 3 of the License,
86+   or (at your option) any later version.
87+
88+   The Midnight Commander is distributed in the hope that it will be useful,
89+   but WITHOUT ANY WARRANTY; without even the implied warranty of
90+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
91+   GNU General Public License for more details.
92+
93+   You should have received a copy of the GNU General Public License
94+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
95+
96+   ------------------------------------------------------------------------------------------------
97+
98+   The viewer is implemented along the following design principles:
99+
100+   Terminology: a "paragraph" is the text between two adjacent newline characters. A "line" or
101+   "row" is a visual row on the screen. In wrap mode, the viewer formats a paragraph into one or
102+   more lines.
103+
104+   The parser-formatter is designed to be stateless across paragraphs. This is so that we can walk
105+   backwards without having to reparse the whole file (we still need to reparse and reformat the
106+   whole paragraph, but that's a lot better).
107+
108+   The parser-formatter, however, needs to carry a state across lines. Currently this state
109+   contains:
110+
111+    - The logical column (as if we didn't wrap). This is used for handling TAB characters after a
112+      wordwrap consistently with less.
113+
114+    - Whether the last nroff character was bold or underlined. This is used for displaying the
115+      ambiguous _\b_ sequence consistently with less.
116+
117+    - Whether the desired way of displaying a lonely combining accent or spacing mark is to place
118+      it over a dotted circle (we do this at the beginning of the paragraph or after a TAB), or to
119+      ignore the combining char and show a replacement char for the spacing mark (we do this if e.g.
120+      too many of these were encountered and hence we don't glue them with their base character).
121+
122+    - (This state needs to be expanded if e.g. we decide to print verbose replacement characters
123+      (e.g. "<U+0080>") and allow these to wrap around lines.)
124+
125+   The state also contains the file offset, as it doesn't make sense to ever
126+   know the state without knowing the corresponding offset.
127+
128+   The state depends on various settings (viewer width, encoding, nroff mode, charwrap or wordwrap
129+   mode (if we'll have that one day) etc.) and needs to be recomputed if any of these changes.
130+
131+   Walking forwards is usually relatively easy both in the file and on the screen. Walking
132+   backwards within a paragraph would only be possible in some special cases and even then it would
133+   be painful, so we always walk back to the beginning of the paragraph and reparse-reformat from
134+   there.
135+
136+   (Walking back within a line in the file would have at least the following difficulties: handling
137+   the parser state; processing invalid UTF-8; processing invalid nroff (e.g. what is "_\bA\bA"?).
138+   Walking back on the display: we wouldn't know where to display the last line of a paragraph, or
139+   where to display a line if its following line starts with a wide (CJK or Tab) character. Long
140+   story short: just forget this approach.)
141+
142+   Most important variables:
143+
144+    - dpy_start: Both in unwrap and wrap modes this points to the beginning of the topmost
145+      displayed paragraph.
146+
147+    - dpy_text_column: Only in unwrap mode, an additional horizontal scroll.
148+
149+    - dpy_paragraph_skip_lines: Only in wrap mode, an additional vertical scroll (the number of
150+      lines that are scrolled off at the top from the topmost paragraph).
151+
152+    - dpy_state_top: Only in wrap mode, the offset and parser-formatter state at the line where
153+      displaying the file begins is cached here.
154+
155+    - dpy_wrap_dirty: Set when some parameter has changed that makes it necessary to
156+      reparse-redisplay the topmost paragraph.
157+
158+   In wrap mode, the three variables "dpy_start", "dpy_paragraph_skip_lines" and "dpy_state_top"
159+   are kept consistent. Think of the first two as the ones describing the position, and the third
160+   as a cached value for better performance so that we don't need to wrap the invisible beginning
161+   of the topmost paragraph over and over again. The third value needs to be recomputed each time a
162+   parameter that influences parsing or displaying the file (e.g. width of screen, encoding, nroff
163+   mode) changes; this is signaled by "dpy_wrap_dirty", which forces recomputing "dpy_state_top"
164+   (and clamping "dpy_paragraph_skip_lines" if necessary).
165+
166+   blahblahblah...
167+
168+   Goals: Always display simple scripts, double wide (CJK), combining accents and spacing marks
169+   (often used e.g. in Devanagari) perfectly. Make the arrow keys always work correctly.
170+
171+   Absolutely non-goal: RTL.
172+
173+   ------------------------------------------------------------------------------------------------
174+
175+   Help integration
176+
177+   I'm planning to port the help viewer to this codebase.
178+
179+   Splitting at sections would still happen in the help viewer. It would either copy a section, or
180+   set force_max and a similar force_min to limit displaying to one section only.
181+
182+   Parsing the help format would go next to the nroff parser. The colors, alternate character set,
183+   and emitting the version number would go to the "state". (The version number would be
184+   implemented by emitting remaining characters of a buffer in the "state" one by one, without
185+   advancing in the file position.)
186+
187+   The active link would be drawn similarly to the search highlight. Other than that, the viewer
188+   wouldn't care about links (except for their color). help.c would keep track of which one is
189+   highlighted, how to advance to the next/prev on an arrow, how the scroll offset needs to be
190+   adjusted when moving, etc.
191+
192+   Add wrapping at word boundaries to where wrapping at char boundaries happens now.
193+ */
194+
195+#include <config.h>
196+#include <wchar.h>   // shouldn't depend on this
197+
198+#include "lib/global.h"
199+#include "lib/tty/tty.h"
200+#include "lib/skin.h"
201+#include "lib/util.h"           /* is_printable() */
202+#ifdef HAVE_CHARSET
203+#include "lib/charsets.h"
204+#endif
205+
206+#include "src/setup.h"          /* option_tab_spacing */
207+
208+#include "internal.h"
209+
210+/*** global variables ****************************************************************************/
211+
212+/*** file scope macro definitions ****************************************************************/
213+
214+#define BASE_CHARACTER_FOR_LONELY_COMBINING 0x25CC  /* dotted circle */
215+#define MAX_COMBINING_CHARS 4  /* both slang and ncurses support exactly 4 */
216+
217+// I think space looks better than arrows. Still, use arrows while developing for better debugging.
218+// Final version could take it from the skin.
219+#define PARTIAL_CJK_AT_LEFT_MARGIN  0x25C2  // ' '  /* '<' doesn't look that good */
220+#define PARTIAL_CJK_AT_RIGHT_MARGIN 0x25B8  // ' '  /* '>' doesn't look that good */
221+
222+/*
223+ * Wrap mode: This is for safety so that jumping to the end of file (which already includes
224+ * scrolling back by a page) and then walking backwards is reasonably fast, even if the file is
225+ * extremely large and consists of, say, all zeros or something like that. If there's no newline
226+ * found within this limit, just start displaying from there and see what happens. We might get
227+ * some display parameters (most importantly the columns) incorrect, but at least we will show the
228+ * file without spinning the CPU for ages. When scrolling back to that point, the user might see a
229+ * garbled first line (even starting with an invalid partial UTF-8), but then walking back by yet
230+ * another line should fix it.
231+ *
232+ * Unwrap mode: This is not used, we wouldn't be able to do anything reasonable without walking
233+ * back a whole paragraph (well, view->data_area.height paragraphs actually).
234+ */
235+#define MAX_BACKWARDS_WALK_IN_PARAGRAPH (100 * 1000)
236+
237+/*** file scope type declarations ****************************************************************/
238+
239+/*** file scope variables ************************************************************************/
240+
241+/*** file scope functions ************************************************************************/
242+
243+// TODO: These methods shouldn't be necessary, see ticket 3257
244+
245+static int
246+mcview_wcwidth (const mcview_t * view, int c)
247+{
248+#ifdef HAVE_CHARSET
249+       if (view->utf8) {
250+               if (g_unichar_iswide(c))
251+                       return 2;
252+               if (g_unichar_iszerowidth(c))
253+                       return 0;
254+       }
255+#endif /* HAVE_CHARSET */
256+       return 1;
257+}
258+
259+static gboolean
260+mcview_ismark (const mcview_t * view, int c)
261+{
262+#ifdef HAVE_CHARSET
263+       if (view->utf8)
264+               return g_unichar_ismark(c);
265+#endif /* HAVE_CHARSET */
266+       return FALSE;
267+}
268+
269+/* actually is_non_spacing_mark_or_enclosing_mark */
270+static gboolean
271+mcview_is_non_spacing_mark (const mcview_t * view, int c)
272+{
273+#ifdef HAVE_CHARSET
274+       if (view->utf8) {
275+               GUnicodeType type = g_unichar_type(c);
276+               return type == G_UNICODE_NON_SPACING_MARK || type == G_UNICODE_ENCLOSING_MARK;
277+       }
278+#endif /* HAVE_CHARSET */
279+       return FALSE;
280+}
281+
282+#if 0
283+static gboolean
284+mcview_is_spacing_mark (const mcview_t * view, int c)
285+{
286+#ifdef HAVE_CHARSET
287+       if (view->utf8) {
288+               return g_unichar_type(c) == G_UNICODE_SPACING_MARK;
289+       }
290+#endif /* HAVE_CHARSET */
291+       return FALSE;
292+}
293+#endif /* 0 */
294+
295+static gboolean
296+mcview_isprint (const mcview_t * view, int c)
297+{
298+#ifdef HAVE_CHARSET
299+       if (!view->utf8)
300+               c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
301+       return g_unichar_isprint(c);
302+#endif /* HAVE_CHARSET */
303+       // TODO this is very-very buggy by design: ticket 3257 comments 0-1
304+       return is_printable (c);
305+}
306+
307+static int
308+mcview_char_display (const mcview_t * view, int c, char *s)
309+{
310+#ifdef HAVE_CHARSET
311+       if (mc_global.utf8_display) {
312+               if (!view->utf8)
313+                       c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
314+               if (!g_unichar_isprint (c))
315+                       c = '.';
316+               return g_unichar_to_utf8(c, s);
317+       } else if (view->utf8) {
318+               if (g_unichar_iswide(c)) {
319+                       s[0] = s[1] = '.';
320+                       return 2;
321+               }
322+               if (g_unichar_iszerowidth(c))
323+                       return 0;
324+               // TODO the is_printable check below will be broken for this
325+               c = convert_from_utf_to_current_c (c, view->converter);
326+       } else {
327+               // TODO the is_printable check below will be broken for this
328+               c = convert_to_display_c (c);
329+       }
330+#endif /* HAVE_CHARSET */
331+       // TODO this is very-very buggy by design: ticket 3257 comments 0-1
332+       if (!is_printable (c))
333+               c = '.';
334+       *s = c;
335+       return 1;
336+}
337+
338+/* --------------------------------------------------------------------------------------------- */
339+
340+/*
341+ * Just for convenience, a common interface in front of mcview_get_utf and mcview_get_byte, so that
342+ * the caller doesn't have to care about utf8 vs 8-bit modes.
343+ *
344+ * Normally: stores c, updates state, returns TRUE.
345+ * At EOF: state is unchanged, c is undefined, returns FALSE.
346+ *
347+ * Also, temporary hack: handle force_max here.
348+ * TODO: move it to lower layers (datasource.c)?
349+ */
350+static gboolean
351+mcview_get_next_char (mcview_t * view, mcview_state_machine_t * state, int *c)
352+{
353+       gboolean result;
354+       int bytes_consumed;
355+
356+       /* Pretend EOF if we reached force_max */
357+       if (view->force_max >= 0 && state->offset >= view->force_max) {
358+               return FALSE;
359+       }
360+#ifdef HAVE_CHARSET
361+        if (view->utf8)
362+        {
363+            *c = mcview_get_utf (view, state->offset, &bytes_consumed, &result);
364+            if (!result)
365+               return FALSE;
366+            /* Pretend EOF if we crossed force_max */
367+            if (view->force_max >= 0 && state->offset + bytes_consumed > view->force_max) {
368+                return FALSE;
369+            }
370+            state->offset += bytes_consumed;
371+            return TRUE;
372+        }
373+#endif /* HAVE_CHARSET */
374+        if (!mcview_get_byte (view, state->offset, c))
375+           return FALSE;
376+        state->offset++;
377+        return TRUE;
378+}
379+
380+/*
381+ * This function parses the next nroff character and gives it to you along with its desired color,
382+ * so you never have to care about nroff again.
383+ *
384+ * The nroff mode does the backspace trick for every single character (Unicode codepoint). At least
385+ * that's what the GNU groff 1.22 package produces, and that's what less 458 expects. For
386+ * double-wide characters (CJK), still only a single backspace is emitted. For combining accents
387+ * and such, the print-backspace-print step is repeated for the base character and then for each
388+ * accent separately.
389+ *
390+ * So, the right place for this layer is after the bytes are interpreted in UTF-8, but before
391+ * joining a base character with its combining accents.
392+ *
393+ * Normally: stores c and color, updates state, returns TRUE.
394+ * At EOF: state is unchanged, c and color are undefined, returns FALSE.
395+ *
396+ * color can be null if the caller doesn't care.
397+ */
398+static gboolean
399+mcview_get_next_maybe_nroff_char (mcview_t * view, mcview_state_machine_t * state, int *c, int *color)
400+{
401+       mcview_state_machine_t state_after_nroff;
402+       int c2, c3;
403+
404+       if (color) *color = VIEW_NORMAL_COLOR;
405+
406+       if (!view->text_nroff_mode)
407+               return mcview_get_next_char (view, state, c);
408+
409+       if (!mcview_get_next_char (view, state, c))
410+               return FALSE;
411+       /* Don't allow nroff formatting around CR, LF, TAB or other special chars */
412+       if (!mcview_isprint(view, *c))
413+               return TRUE;
414+
415+       state_after_nroff = *state;
416+
417+       if (!mcview_get_next_char (view, &state_after_nroff, &c2))
418+               return TRUE;
419+       if (c2 != '\b')
420+               return TRUE;
421+
422+       if (!mcview_get_next_char (view, &state_after_nroff, &c3))
423+               return TRUE;
424+       if (!mcview_isprint(view, c3))
425+               return TRUE;
426+
427+       if (*c == '_' && c3 == '_') {
428+               *state = state_after_nroff;
429+               if (color) *color = state->nroff_underscore_is_underlined ? VIEW_UNDERLINED_COLOR : VIEW_BOLD_COLOR;
430+               return TRUE;
431+       } else if (*c == c3) {
432+               *state = state_after_nroff;
433+               state->nroff_underscore_is_underlined = FALSE;
434+               if (color) *color = VIEW_BOLD_COLOR;
435+               return TRUE;
436+       } else if (*c == '_') {
437+               *c = c3;
438+               *state = state_after_nroff;
439+               state->nroff_underscore_is_underlined = TRUE;
440+               if (color) *color = VIEW_UNDERLINED_COLOR;
441+               return TRUE;
442+       } else {
443+               return TRUE;
444+       }
445+}
446+
447+/*
448+ * Get one base character, along with its combining or spacing mark characters.
449+ *
450+ * (A spacing mark is a character that extends the base character's width 1 into a combined
451+ * character of width 2, yet these two character cells should not be separated. E.g. Devanagari
452+ * <U+0939><U+094B>.)
453+ *
454+ * This method exists mainly for two reasons. One is to be able to tell if we fit on the current
455+ * line or need to wrap to the next one. The other is that both slang and ncurses seem to require
456+ * that the character and its combining marks are printed in a single call (or is it just a
457+ * limitation of mc's wrapper to them?).
458+ *
459+ * For convenience, this method takes care of converting CR or CR+LF into LF.
460+ * TODO this should probably happen later, when displaying the file?
461+ *
462+ * Normally: stores cs and color, updates state, returns >= 1 (entries in cs).
463+ * At EOF: state is unchanged, cs and color are undefined, returns 0.
464+ *
465+ * @param view ...
466+ * @param state the parser-formatter state machine's state, updated
467+ * @param cs store the characters here
468+ * @param clen the room available in cs (that is, at most clen-1 combining marks are allowed), must
469+ *   be at least 2
470+ * @param color if non-NULL, store the color here, taken from the first codepoint's color
471+ * @return the number of entries placed in cs, or 0 on EOF
472+ */
473+static int
474+mcview_next_complex_char (mcview_t * view, mcview_state_machine_t * state, int *cs, int clen, int *color)
475+{
476+       int i = 1;
477+       mcview_state_machine_t state_after_combining;
478+
479+       if (!mcview_get_next_maybe_nroff_char (view, state, cs, color))
480+               return 0;
481+
482+       /* Process \r and \r\n newlines. */
483+       if (cs[0] == '\r') {
484+               int cnext;
485+               mcview_state_machine_t state_after_crlf = *state;
486+               if (mcview_get_next_maybe_nroff_char (view, &state_after_crlf, &cnext, NULL) && cnext == '\n')
487+                       *state = state_after_crlf;
488+               cs[0] = '\n';
489+               return 1;
490+       }
491+
492+       /* We don't want combining over non-printable characters. This includes '\n' and '\t' too. */
493+       if (!mcview_isprint(view, cs[0]))
494+               return 1;
495+
496+       if (mcview_ismark(view, cs[0])) {
497+               if (!state->print_lonely_combining) {
498+                       /* First character is combining. Either just return it, ... */
499+                       return 1;
500+               } else {
501+                       /* or place this (and subsequent combining ones) over a dotted circle. */
502+                       cs[1] = cs[0];
503+                       cs[0] = BASE_CHARACTER_FOR_LONELY_COMBINING;
504+                       i = 2;
505+               }
506+       }
507+
508+       if (mcview_wcwidth(view, cs[0]) == 2) {
509+               /* Don't allow combining or spacing mark for wide characters, is this okay? */
510+               return 1;
511+       }
512+
513+       /* Look for more combining chars. Either at most clen-1 zero-width combining chars,
514+        * or at most 1 spacing mark. Is this logic correct? */
515+       for (; i < clen; i++) {
516+               state_after_combining = *state;
517+               if (!mcview_get_next_maybe_nroff_char (view, &state_after_combining, &cs[i], NULL))
518+                       return i;
519+               if (!mcview_ismark(view, cs[i]) || !mcview_isprint(view, cs[i]))
520+                       return i;
521+               if (g_unichar_type(cs[i]) == G_UNICODE_SPACING_MARK) {  // is this the right check?
522+                       /* Only allow as the first combining char. Stop processing in either case. */
523+                       if (i == 1) {
524+                               *state = state_after_combining;
525+                               i++;
526+                       }
527+                       return i;
528+               }
529+               *state = state_after_combining;
530+       }
531+       return i;
532+}
533+
534+/*
535+ * Parse, format and possibly display one visual line of text.
536+ *
537+ * Formatting starts at the given "state" (which encodes the file offset and parser and formatter's
538+ * internal state). In unwrap mode, this should point to the beginning of the paragraph with the
539+ * default state, the additional horizontal scrolling is added here. In wrap mode, this should
540+ * point to the beginning of the line, with the proper state at that point.
541+ *
542+ * In wrap mode, if a line ends in a newline, it is consumed, even if it's exactly at the right
543+ * edge. In unwrap mode, the whole remaining line, including the newline, is consumed. Displaying
544+ * the next line should start at "state"'s new value, or if we displayed the bottom line then
545+ * state->offset tells the file offset to be shown in the top bar.
546+ *
547+ * If "row" is offscreen, don't actually display the line but still update "state" and return the
548+ * proper value. This is used by mcview_wrap_move_down to advance in the file.
549+ *
550+ * @param view ...
551+ * @param state the parser-formatter state machine's state, updated
552+ * @param row print to this row
553+ * @param paragraph_ended store TRUE if paragraph ended by newline or EOF, FALSE if wraps to next
554+ *   line
555+ * @return the number of rows, that is, 0 if we were already at EOF, otherwise 1
556+ */
557+static int
558+mcview_display_line (mcview_t * view, mcview_state_machine_t * state, int row, gboolean *paragraph_ended)
559+{
560+       const screen_dimen left = view->data_area.left;
561+       const screen_dimen top = view->data_area.top;
562+       const screen_dimen width = view->data_area.width;
563+       const screen_dimen height = view->data_area.height;
564+       off_t dpy_text_column = view->text_wrap_mode ? 0 : view->dpy_text_column;  // actually maybe we shouldn't allow view->data_area.left to be any different
565+       screen_dimen col = 0;
566+       int color;
567+       int cs[1 + MAX_COMBINING_CHARS];
568+       int n;
569+       char str[(1 + MAX_COMBINING_CHARS) * UTF8_CHAR_LEN + 1];
570+       int charwidth;
571+       int i, j;
572+       mcview_state_machine_t state_saved;
573+
574+       if (paragraph_ended) *paragraph_ended = TRUE;
575+
576+       if (!view->text_wrap_mode && col >= dpy_text_column + width) {
577+               /* Optimization: Fast forward to the end of the line, rather than carefully
578+                * parsing and then not actually displaying it. */
579+               off_t eol = mcview_eol(view, state->offset, mcview_get_filesize (view));
580+               int retval = eol > state->offset ? 1 : 0;
581+               mcview_state_machine_init (state, eol);
582+               return retval;
583+       }
584+
585+       while (1) {
586+               state_saved = *state;
587+               n = mcview_next_complex_char (view, state, cs, 1 + MAX_COMBINING_CHARS, &color);
588+               if (n == 0)
589+                       return col > 0 ? 1 : 0;
590+
591+               if (view->search_start <= state->offset && state->offset < view->search_end)
592+                       color = SELECTED_COLOR;
593+
594+               if (cs[0] == '\n') {
595+                       /* New line: reset all formatting state for the next paragraph. */
596+                       mcview_state_machine_init (state, state->offset);
597+                       return 1;
598+               }
599+
600+               if (mcview_is_non_spacing_mark(view, cs[0])) {
601+                       /* Lonely combining character. Probably leftover after too many combining chars. Just ignore. */
602+                       continue;
603+               }
604+
605+               /* Nonprintable, or lonely spacing mark */
606+               if ((!mcview_isprint(view, cs[0]) || mcview_ismark(view, cs[0])) && cs[0] != '\t')
607+                       cs[0] = '.';
608+
609+               charwidth = 0;
610+               for (i = 0; i < n; i++) {
611+                       charwidth += mcview_wcwidth(view, cs[i]);
612+               }
613+
614+               /* Adjust the width for TAB. It's handled below along with the normal characters,
615+                * so that it's wrapped consistently with them, and is painted with the proper
616+                * attributes (although currently it can't have a special color). */
617+               if (cs[0] == '\t') {
618+                       charwidth = option_tab_spacing - state->unwrapped_column % option_tab_spacing;
619+                       state->print_lonely_combining = TRUE;
620+               } else {
621+                       state->print_lonely_combining = FALSE;
622+               }
623+
624+               /* In wrap mode only: We're done with this row if the complex character wouldn't
625+                * fit. Except if at the first column, because then it wouldn't fit in the next row
626+                * either. In this extreme case let the unwrapped code below do its best to display
627+                * it. */
628+               if (view->text_wrap_mode && (off_t) col + charwidth > dpy_text_column + width && col > 0) {
629+                       *state = state_saved;
630+                       if (paragraph_ended) *paragraph_ended = FALSE;
631+                       return 1;
632+               }
633+
634+               /* Display, unless outside of the viewport. */
635+               if (row >= 0 && row < (int) height) {
636+                       if ((off_t) col >= dpy_text_column &&
637+                           (off_t) col + charwidth <= dpy_text_column + width) {
638+                               /* The complex character fits entirely in the viewport. Print it. */
639+                               tty_setcolor(color);
640+                               widget_move (view, top + row, left + ((off_t) col - dpy_text_column));
641+                               if (cs[0] == '\t') {
642+                                       for (i = 0; i < charwidth; i++)
643+                                               tty_print_char(' ');
644+                               } else {
645+                                       j = 0;
646+                                       for (i = 0; i < n; i++) {
647+                                               j += mcview_char_display(view, cs[i], str + j);
648+                                       }
649+                                       str[j] = '\0';
650+                                       /* This is probably a bug in our tty layer, but tty_print_string
651+                                        * normalizes the string, whereas tty_printf doesn't. Don't normalize,
652+                                        * since we handle combining characters ourselves correctly, it's
653+                                        * better if they are copy-pasted correctly. Ticket 3255. */
654+                                       tty_printf ("%s", str);
655+                               }
656+                       } else if ((off_t) col < dpy_text_column &&
657+                                  (off_t) col + charwidth > dpy_text_column) {
658+                               /* The complex character would cross the left edge of the viewport.
659+                                * This cannot happen with wrap mode. Print replacement character(s),
660+                                * or spaces with the correct attributes for partial Tabs. */
661+                               tty_setcolor(color);
662+                               for (i = dpy_text_column; i < (off_t) col + charwidth && i < dpy_text_column + width; i++) {
663+                                       widget_move (view, top + row, left + (i - dpy_text_column));
664+                                       tty_print_anychar(cs[0] == '\t' ? ' ' : PARTIAL_CJK_AT_LEFT_MARGIN);
665+                               }
666+                       } else if ((off_t) col < dpy_text_column + width &&
667+                                  (off_t) col + charwidth > dpy_text_column + width) {
668+                               /* The complex character would cross the right edge of the viewport
669+                                * and we're not wrapping. Print replacement character(s),
670+                                * or spaces with the correct attributes for partial Tabs. */
671+                               tty_setcolor(color);
672+                               for (i = col; i < dpy_text_column + width; i++) {
673+                                       widget_move (view, top + row, left + (i - dpy_text_column));
674+                                       tty_print_anychar(cs[0] == '\t' ? ' ' : PARTIAL_CJK_AT_RIGHT_MARGIN);
675+                               }
676+                       }
677+               }
678+
679+               col += charwidth;
680+               state->unwrapped_column += charwidth;
681+
682+               if (!view->text_wrap_mode && col >= dpy_text_column + width) {
683+                       /* Optimization: Fast forward to the end of the line, rather than carefully
684+                        * parsing and then not actually displaying it. */
685+                       off_t eol = mcview_eol(view, state->offset, mcview_get_filesize (view));
686+                       mcview_state_machine_init (state, eol);
687+                       return 1;
688+               }
689+       }
690+}
691+
692+/*
693+ * Parse, format and possibly display one paragraph (perhaps not from the beginning).
694+ *
695+ * Formatting starts at the given "state" (which encodes the file offset and parser and formatter's
696+ * internal state). In unwrap mode, this should point to the beginning of the paragraph with the
697+ * default state; the additional horizontal scrolling is added here. In wrap mode, this may point
698+ * to the beginning of the line within a paragraph (to display the partial paragraph at the top),
699+ * with the proper state at that point.
700+ *
701+ * Displaying the next paragraph should start at "state"'s new value, or if we displayed the bottom
702+ * line then state->offset tells the file offset to be shown in the top bar.
703+ *
704+ * If "row" is negative, don't display the first abs(row) lines and display the rest from the top.
705+ * This was a nice idea but it's now unused :)
706+ *
707+ * If "row" is too large, don't display the paragraph at all but still return the number of lines.
708+ * This is used when moving upwards.
709+ *
710+ * @param view ...
711+ * @param state the parser-formatter state machine's state, updated
712+ * @param row print starting at this row
713+ * @return the number of rows the paragraph is wrapped to, that is, 0 if we were already at EOF,
714+ *   otherwise 1 in unwrap mode, >= 1 in wrap mode. We stop when reaching the bottom of the
715+ *   viewport; how many more lines the paragraph would occupy is not counted.
716+ */
717+static int
718+mcview_display_paragraph (mcview_t * view, mcview_state_machine_t * state, int row)
719+{
720+       const screen_dimen height = view->data_area.height;
721+       int lines = 0;
722+       gboolean paragraph_ended;
723+
724+       while (1) {
725+               lines += mcview_display_line(view, state, row, &paragraph_ended);
726+               if (paragraph_ended)
727+                       return lines;
728+
729+               if (row < (int) height) {
730+                       row++;
731+                       /* stop if bottom of screen reached */
732+                       if (row >= (int) height)
733+                               return lines;
734+               }
735+       }
736+}
737+
738+/*
739+ * Recompute dpy_state_top from dpy_start and dpy_paragraph_skip_lines. Clamp
740+ * dpy_paragraph_skip_lines if necessary.
741+ *
742+ * This method should be called in wrap mode after changing one of the parsing or formatting
743+ * properties (e.g. window width, encoding, nroff), or when switching to wrap mode from unwrap or
744+ * hex.
745+ *
746+ * If we stayed within the same paragraph then try to keep the vertical offset within that
747+ * paragraph as well. It might happen though that the paragraph became shorter than our desired
748+ * vertical position; in that case move to its last row.
749+ */
750+static void
751+mcview_wrap_fixup (mcview_t * view)
752+{
753+       mcview_state_machine_t state_prev;
754+       gboolean paragraph_ended;
755+       int lines = view->dpy_paragraph_skip_lines;
756+
757+       if (!view->dpy_wrap_dirty)
758+               return;
759+       view->dpy_wrap_dirty = FALSE;
760+
761+       view->dpy_paragraph_skip_lines = 0;
762+       mcview_state_machine_init (&view->dpy_state_top, view->dpy_start);
763+
764+       while (lines--) {
765+               state_prev = view->dpy_state_top;
766+               if (!mcview_display_line (view, &view->dpy_state_top, -1, &paragraph_ended))
767+                       break;
768+               if (paragraph_ended) {
769+                       view->dpy_state_top = state_prev;
770+                       break;
771+               }
772+               view->dpy_paragraph_skip_lines++;
773+       }
774+}
775+
776+/* --------------------------------------------------------------------------------------------- */
777+/*** public functions ****************************************************************************/
778+/* --------------------------------------------------------------------------------------------- */
779+
780+/*
781+ * In both wrap and unwrap modes, dpy_start points to the beginning of the paragraph.
782+ *
783+ * In unwrap mode, start displaying from this position, probably applying an additional horizontal
784+ * scroll.
785+ *
786+ * In wrap mode, an additional dpy_paragraph_skip_lines lines are skipped from the top of this
787+ * paragraph. dpy_state_top contains the position and parser-formatter state corresponding to the
788+ * top left corner so we can just start rendering from here. Unless dpy_wrap_dirty is set in which
789+ * case dpy_state_top is invalid and we need to recompute first.
790+ */
791+void
792+mcview_display_text (mcview_t * view)
793+{
794+       const screen_dimen left = view->data_area.left;
795+       const screen_dimen top = view->data_area.top;
796+       const screen_dimen height = view->data_area.height;
797+       int row;
798+       int n;
799+       mcview_state_machine_t state;
800+       gboolean again;
801+
802+       do {
803+               again = FALSE;
804+
805+               mcview_display_clean (view);
806+               mcview_display_ruler (view);
807+
808+               if (view->text_wrap_mode) {
809+                       mcview_wrap_fixup (view);
810+                       state = view->dpy_state_top;
811+               } else {
812+                       mcview_state_machine_init(&state, view->dpy_start);
813+               }
814+               row = 0;
815+               while (row < (int) height) {
816+                       n = mcview_display_paragraph (view, &state, row);
817+                       if (n == 0) {
818+                               /* This is quite ugly here. In the rare case that displaying didn't
819+                                * start at the beginning of the file, yet there are some empty
820+                                * lines at the bottom, scroll the file and display again. This
821+                                * happens when e.g. the window is made bigger, or the file becomes
822+                                * shorter due to charset change or enabling nroff.
823+                                *
824+                                * TODO: probably set some dirty bit when any of these changes, and
825+                                * perform a blind rendering first to see if it would fit, before
826+                                * talking to slang/ncurses. (Although they do buffering too, so
827+                                * it's not that bad. And that approach would render twice and
828+                                * hence be slower in the typical case.)
829+                                */
830+                               if ((view->text_wrap_mode ? view->dpy_state_top.offset : view->dpy_start) > 0) {
831+                                       mcview_ascii_move_up (view, height - row);
832+                                       again = TRUE;
833+                               }
834+                               break;
835+                       }
836+                       row += n;
837+               }
838+       } while (again);
839+
840+       view->dpy_end = state.offset;
841+       view->dpy_state_bottom = state;
842+
843+       if (mcview_show_eof != NULL && mcview_show_eof[0] != '\0') {
844+               while (row < (int) height) {
845+                       widget_move (view, top + row, left);
846+                       /* TODO: should make it no wider than the viewport */
847+                       tty_print_string (mcview_show_eof);
848+                       row++;
849+               }
850+       }
851+}
852+
853+/*
854+ * Move down.
855+ *
856+ * It's very simple. Just invisibly format the next "lines" lines, carefully carrying the formatter
857+ * state in wrap mode. But before each step we need to check if we've already hit the end of the
858+ * file; in that case we can no longer move. This is done by walking from dpy_state_bottom.
859+ *
860+ * Note that this relies on mcview_display_text() setting dpy_state_bottom to its correct value
861+ * upon rendering the screen contents. So don't call this function from other functions (e.g. at
862+ * the bottom of mcview_ascii_move_up()) which invalidate this value.
863+ */
864+void
865+mcview_ascii_move_down (mcview_t * view, off_t lines)
866+{
867+       gboolean paragraph_ended;
868+
869+       while (lines--) {
870+               /* See if there's still data below the bottom line. If not, we can't scroll any
871+                * more. If there is, adjust dpy_state_bottom by imaginarily displaying one more
872+                * line there. */
873+               if (view->dpy_state_bottom.offset >= mcview_get_filesize (view))
874+                       break;
875+               mcview_display_line (view, &view->dpy_state_bottom, -1, &paragraph_ended);
876+
877+               /* Okay, there's enough data. Move by 1 row at the top, too. No need to check for
878+                * EOF, that can't happen. */
879+               if (!view->text_wrap_mode) {
880+                       view->dpy_start = mcview_eol(view, view->dpy_start, mcview_get_filesize (view));
881+                       view->dpy_paragraph_skip_lines = 0;
882+                       view->dpy_wrap_dirty = TRUE;
883+               } else {
884+                       mcview_display_line (view, &view->dpy_state_top, -1, &paragraph_ended);
885+                       if (paragraph_ended) {
886+                               view->dpy_start = view->dpy_state_top.offset;
887+                               view->dpy_paragraph_skip_lines = 0;
888+                       } else {
889+                               view->dpy_paragraph_skip_lines++;
890+                       }
891+               }
892+       }
893+}
894+
895+/*
896+ * Move up.
897+ *
898+ * Unwrap mode: Piece of cake. Wrap mode: If we'd walk back more than the current line offset
899+ * within the paragraph, we need to jump back to the previous paragraph and compute its height to
900+ * see if we start from that paragraph, and repeat this if necessary. Once we're within the desired
901+ * paragraph, we still need to format it from its beginning to know the state.
902+ *
903+ * See the top of this file for comments about MAX_BACKWARDS_WALK_IN_PARAGRAPH.
904+ *
905+ * force_max protects against the rare extreme case that the file underneath us changes:
906+ * we don't want to endlessly consume a file that may be full of zeros when moving upwards.
907+ */
908+void
909+mcview_ascii_move_up (mcview_t * view, off_t lines)
910+{
911+       int i;
912+
913+       if (!view->text_wrap_mode) {
914+               while (lines--)
915+                       view->dpy_start = mcview_bol(view, view->dpy_start - 1, 0);
916+               view->dpy_paragraph_skip_lines = 0;
917+               view->dpy_wrap_dirty = TRUE;
918+       } else {
919+               while (lines > view->dpy_paragraph_skip_lines) {
920+                       /* We need to go back to the previous paragraph. */
921+                       if (view->dpy_start == 0) {
922+                               /* Oops, we're already in the first paragraph. */
923+                               view->dpy_paragraph_skip_lines = 0;
924+                               mcview_state_machine_init(&view->dpy_state_top, 0);
925+                               return;
926+                       }
927+                       lines -= view->dpy_paragraph_skip_lines;
928+                       view->force_max = view->dpy_start;
929+                       view->dpy_start = mcview_bol (view, view->dpy_start - 1, view->dpy_start - MAX_BACKWARDS_WALK_IN_PARAGRAPH);
930+                       mcview_state_machine_init(&view->dpy_state_top, view->dpy_start);
931+                       /* This is a tricky way of denoting that we're at the end of the paragraph.
932+                        * Normally we'd jump to the next paragraph and reset paragraph_skip_lines. But for
933+                        * walking backwards this is exactly what we need. */
934+                       view->dpy_paragraph_skip_lines = mcview_display_paragraph (view, &view->dpy_state_top, view->data_area.height);
935+                       view->force_max = -1;
936+               }
937+
938+               /* Okay, we have dpy_start pointing to the desired paragraph, and we still need to
939+                * walk back "lines" lines from the current "dpy_paragraph_skip_lines" offset. We can't do
940+                * that, so walk from the beginning of the paragraph. */
941+               mcview_state_machine_init(&view->dpy_state_top, view->dpy_start);
942+               view->dpy_paragraph_skip_lines -= lines;
943+               for (i = 0; i < view->dpy_paragraph_skip_lines; i++)
944+                       mcview_display_line (view, &view->dpy_state_top, -1, NULL);
945+       }
946+}
947+
948+/* --------------------------------------------------------------------------------------------- */
949+
950+void
951+mcview_state_machine_init (mcview_state_machine_t * state, off_t offset)
952+{
953+       memset(state, 0, sizeof (*state));
954+       state->offset = offset;
955+       state->print_lonely_combining = TRUE;
956+}
957+
958+/* --------------------------------------------------------------------------------------------- */
959diff --git a/src/viewer/datasource.c b/src/viewer/datasource.c
960index 9ec5ab4..97b04c6 100644
961--- a/src/viewer/datasource.c
962+++ b/src/viewer/datasource.c
963@@ -164,7 +164,7 @@ mcview_get_ptr_string (mcview_t * view, off_t byte_index)
964 /* --------------------------------------------------------------------------------------------- */
965 
966 int
967-mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * result)
968+mcview_get_utf (mcview_t * view, off_t byte_index, int *bytes_consumed, gboolean * result)
969 {
970     gchar *str = NULL;
971     int res = -1;
972@@ -172,7 +172,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
973     gchar *next_ch = NULL;
974     gchar utf8buf[UTF8_CHAR_LEN + 1];
975 
976-    *char_width = 0;
977+    *bytes_consumed = 0;
978     *result = FALSE;
979 
980     switch (view->datasource)
981@@ -218,7 +218,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
982     if (res < 0)
983     {
984         ch = *str;
985-        *char_width = 1;
986+        *bytes_consumed = 1;
987     }
988     else
989     {
990@@ -226,7 +226,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
991         /* Calculate UTF-8 char width */
992         next_ch = g_utf8_next_char (str);
993         if (next_ch)
994-            *char_width = next_ch - str;
995+            *bytes_consumed = next_ch - str;
996         else
997             return 0;
998     }
999diff --git a/src/viewer/display.c b/src/viewer/display.c
1000index 00c6ec0..b1bd390 100644
1001--- a/src/viewer/display.c
1002+++ b/src/viewer/display.c
1003@@ -251,10 +251,6 @@ mcview_display (mcview_t * view)
1004     {
1005         mcview_display_hex (view);
1006     }
1007-    else if (view->text_nroff_mode)
1008-    {
1009-        mcview_display_nroff (view);
1010-    }
1011     else
1012     {
1013         mcview_display_text (view);
1014diff --git a/src/viewer/internal.h b/src/viewer/internal.h
1015index f172c5b..319f77e 100644
1016--- a/src/viewer/internal.h
1017+++ b/src/viewer/internal.h
1018@@ -87,6 +87,18 @@ typedef struct
1019     coord_cache_entry_t **cache;
1020 } coord_cache_t;
1021 
1022+// TODO: find a better name. This is not actually a "state machine",
1023+// but a "state machine's state", but that sounds silly.
1024+// Could be parser_state, formatter_state...
1025+typedef struct
1026+{
1027+    off_t offset;               /* The file offset at which this is the state. */
1028+    off_t unwrapped_column;     /* Columns if the paragraph wasn't wrapped, */
1029+                                /* used for positioning TABs in wrapped lines */
1030+    gboolean nroff_underscore_is_underlined;  /* whether _\b_ is underlined rather than bold */
1031+    gboolean print_lonely_combining;   /* whether lonely combining marks are printed on a dotted circle */
1032+} mcview_state_machine_t;
1033+
1034 struct mcview_nroff_struct;
1035 
1036 struct mcview_struct
1037@@ -140,8 +152,12 @@ struct mcview_struct
1038 
1039     /* Display information */
1040     screen_dimen dpy_frame_size;        /* Size of the frame surrounding the real viewer */
1041-    off_t dpy_start;            /* Offset of the displayed data */
1042+    off_t dpy_start;            /* Offset of the displayed data (start of the paragraph in non-hex mode) */
1043     off_t dpy_end;              /* Offset after the displayed data */
1044+    off_t dpy_paragraph_skip_lines; /* Extra lines to skip in wrap mode */
1045+    mcview_state_machine_t dpy_state_top;  /* Parser-formatter state at the topmost visible line in wrap mode */
1046+    mcview_state_machine_t dpy_state_bottom;  /* Parser-formatter state after the bottom visible line in wrap mode */
1047+    gboolean dpy_wrap_dirty;    /* dpy_state_top needs to be recomputed */
1048     off_t dpy_text_column;      /* Number of skipped columns in non-wrap
1049                                  * text mode */
1050     off_t hex_cursor;           /* Hexview cursor position in file */
1051@@ -152,6 +168,8 @@ struct mcview_struct
1052     struct area ruler_area;     /* Where the ruler is displayed */
1053     struct area data_area;      /* Where the data is displayed */
1054 
1055+    off_t force_max;            /* Force a max offset, or -1 */
1056+
1057     int dirty;                  /* Number of skipped updates */
1058     gboolean dpy_bbar_dirty;    /* Does the button bar need to be updated? */
1059 
1060@@ -219,6 +237,12 @@ cb_ret_t mcview_callback (Widget * w, Widget * sender, widget_msg_t msg, int par
1061 cb_ret_t mcview_dialog_callback (Widget * w, Widget * sender, widget_msg_t msg, int parm,
1062                                  void *data);
1063 
1064+/* ascii.c: */
1065+void mcview_display_text (mcview_t *);
1066+void mcview_state_machine_init (mcview_state_machine_t *, off_t);
1067+void mcview_ascii_move_down (mcview_t *, off_t);
1068+void mcview_ascii_move_up (mcview_t *, off_t);
1069+
1070 /* coord_cache.c: */
1071 coord_cache_t *coord_cache_new (void);
1072 void coord_cache_free (coord_cache_t * cache);
1073@@ -307,9 +331,7 @@ void mcview_place_cursor (mcview_t *);
1074 void mcview_moveto_match (mcview_t *);
1075 
1076 /* nroff.c: */
1077-void mcview_display_nroff (mcview_t * view);
1078 int mcview__get_nroff_real_len (mcview_t * view, off_t, off_t p);
1079-
1080 mcview_nroff_t *mcview_nroff_seq_new_num (mcview_t * view, off_t p);
1081 mcview_nroff_t *mcview_nroff_seq_new (mcview_t * view);
1082 void mcview_nroff_seq_free (mcview_nroff_t **);
1083@@ -317,10 +339,6 @@ nroff_type_t mcview_nroff_seq_info (mcview_nroff_t *);
1084 int mcview_nroff_seq_next (mcview_nroff_t *);
1085 int mcview_nroff_seq_prev (mcview_nroff_t *);
1086 
1087-
1088-/* plain.c: */
1089-void mcview_display_text (mcview_t *);
1090-
1091 /* search.c: */
1092 mc_search_cbret_t mcview_search_cmd_callback (const void *user_data, gsize char_offset,
1093                                               int *current_char);
1094diff --git a/src/viewer/lib.c b/src/viewer/lib.c
1095index c996c45..46b12c9 100644
1096--- a/src/viewer/lib.c
1097+++ b/src/viewer/lib.c
1098@@ -106,9 +106,8 @@ mcview_toggle_magic_mode (mcview_t * view)
1099 void
1100 mcview_toggle_wrap_mode (mcview_t * view)
1101 {
1102-    if (view->text_wrap_mode)
1103-        view->dpy_start = mcview_bol (view, view->dpy_start, 0);
1104     view->text_wrap_mode = !view->text_wrap_mode;
1105+    view->dpy_wrap_dirty = TRUE;
1106     view->dpy_bbar_dirty = TRUE;
1107     view->dirty++;
1108 }
1109@@ -120,6 +119,7 @@ mcview_toggle_nroff_mode (mcview_t * view)
1110 {
1111     view->text_nroff_mode = !view->text_nroff_mode;
1112     mcview_altered_nroff_flag = 1;
1113+    view->dpy_wrap_dirty = TRUE;
1114     view->dpy_bbar_dirty = TRUE;
1115     view->dirty++;
1116 }
1117@@ -144,6 +144,8 @@ mcview_toggle_hex_mode (mcview_t * view)
1118         widget_want_cursor (WIDGET (view), FALSE);
1119     }
1120     mcview_altered_hex_mode = 1;
1121+    view->dpy_paragraph_skip_lines = 0;
1122+    view->dpy_wrap_dirty = TRUE;
1123     view->dpy_bbar_dirty = TRUE;
1124     view->dirty++;
1125 }
1126@@ -170,6 +172,10 @@ mcview_init (mcview_t * view)
1127     view->coord_cache = NULL;
1128 
1129     view->dpy_start = 0;
1130+    view->dpy_paragraph_skip_lines = 0;
1131+    mcview_state_machine_init (&view->dpy_state_top, 0);
1132+    view->dpy_wrap_dirty = FALSE;
1133+    view->force_max = -1;
1134     view->dpy_text_column = 0;
1135     view->dpy_end = 0;
1136     view->hex_cursor = 0;
1137@@ -283,6 +289,7 @@ mcview_set_codeset (mcview_t * view)
1138             view->converter = conv;
1139         }
1140         view->utf8 = (gboolean) str_isutf8 (cp_id);
1141+        view->dpy_wrap_dirty = TRUE;
1142     }
1143 #else
1144     (void) view;
1145@@ -340,7 +347,7 @@ mcview_bol (mcview_t * view, off_t current, off_t limit)
1146         if (c == '\r')
1147             current--;
1148     }
1149-    while (current > 0 && current >= limit)
1150+    while (current > 0 && current > limit)
1151     {
1152         if (!mcview_get_byte (view, current - 1, &c))
1153             break;
1154diff --git a/src/viewer/mcviewer.c b/src/viewer/mcviewer.c
1155index f55eecf..0a009ba 100644
1156--- a/src/viewer/mcviewer.c
1157+++ b/src/viewer/mcviewer.c
1158@@ -397,6 +397,10 @@ mcview_load (mcview_t * view, const char *command, const char *file, int start_l
1159   finish:
1160     view->command = g_strdup (command);
1161     view->dpy_start = 0;
1162+    view->dpy_paragraph_skip_lines = 0;
1163+    mcview_state_machine_init (&view->dpy_state_top, 0);
1164+    view->dpy_wrap_dirty = FALSE;
1165+    view->force_max = -1;
1166     view->search_start = 0;
1167     view->search_end = 0;
1168     view->dpy_text_column = 0;
1169@@ -416,7 +420,10 @@ mcview_load (mcview_t * view, const char *command, const char *file, int start_l
1170         else
1171             new_offset = min (new_offset, max_offset);
1172         if (!view->hex_mode)
1173+        {
1174             view->dpy_start = mcview_bol (view, new_offset, 0);
1175+            view->dpy_wrap_dirty = TRUE;
1176+        }
1177         else
1178         {
1179             view->dpy_start = new_offset - new_offset % view->bytes_per_line;
1180diff --git a/src/viewer/move.c b/src/viewer/move.c
1181index 7cd852b..c8facc5 100644
1182--- a/src/viewer/move.c
1183+++ b/src/viewer/move.c
1184@@ -83,6 +83,8 @@ mcview_scroll_to_cursor (mcview_t * view)
1185         if (cursor < topleft)
1186             topleft = mcview_offset_rounddown (cursor, bytes);
1187         view->dpy_start = topleft;
1188+        view->dpy_paragraph_skip_lines = 0;
1189+        view->dpy_wrap_dirty = TRUE;
1190     }
1191 }
1192 
1193@@ -107,64 +109,24 @@ mcview_movement_fixups (mcview_t * view, gboolean reset_search)
1194 void
1195 mcview_move_up (mcview_t * view, off_t lines)
1196 {
1197-    off_t new_offset;
1198-
1199     if (view->hex_mode)
1200     {
1201         off_t bytes = lines * view->bytes_per_line;
1202         if (view->hex_cursor >= bytes)
1203         {
1204             view->hex_cursor -= bytes;
1205-            if (view->hex_cursor < view->dpy_start)
1206+            if (view->hex_cursor < view->dpy_start) {
1207                 view->dpy_start = mcview_offset_doz (view->dpy_start, bytes);
1208+                view->dpy_paragraph_skip_lines = 0;
1209+                view->dpy_wrap_dirty = TRUE;
1210+            }
1211         }
1212         else
1213         {
1214             view->hex_cursor %= view->bytes_per_line;
1215         }
1216-    }
1217-    else
1218-    {
1219-        off_t i;
1220-
1221-        for (i = 0; i < lines; i++)
1222-        {
1223-            if (view->dpy_start == 0)
1224-                break;
1225-            if (view->text_wrap_mode)
1226-            {
1227-                new_offset = mcview_bol (view, view->dpy_start, view->dpy_start - (off_t) 1);
1228-                /* check if dpy_start == BOL or not (then new_offset = dpy_start - 1,
1229-                 * no need to check more) */
1230-                if (new_offset == view->dpy_start)
1231-                {
1232-                    size_t last_row_length;
1233-
1234-                    new_offset = mcview_bol (view, new_offset - 1, 0);
1235-                    last_row_length = (view->dpy_start - new_offset) % view->data_area.width;
1236-                    if (last_row_length != 0)
1237-                    {
1238-                        /* if dpy_start == BOL in wrapped mode, find BOL of previous line
1239-                         * and move down all but the last rows */
1240-                        new_offset = view->dpy_start - (off_t) last_row_length;
1241-                    }
1242-                }
1243-                else
1244-                {
1245-                    /* if dpy_start != BOL in wrapped mode, just move one row up;
1246-                     * no need to check if > 0 as there is at least exactly one wrap
1247-                     * between dpy_start and BOL */
1248-                    new_offset = view->dpy_start - (off_t) view->data_area.width;
1249-                }
1250-                view->dpy_start = new_offset;
1251-            }
1252-            else
1253-            {
1254-                /* if unwrapped -> current BOL equals dpy_start, just find BOL of previous line */
1255-                new_offset = view->dpy_start - 1;
1256-                view->dpy_start = mcview_bol (view, new_offset, 0);
1257-            }
1258-        }
1259+    } else {
1260+        mcview_ascii_move_up (view, lines);
1261     }
1262     mcview_movement_fixups (view, TRUE);
1263 }
1264@@ -187,52 +149,14 @@ mcview_move_down (mcview_t * view, off_t lines)
1265         for (i = 0; i < lines && view->hex_cursor < limit; i++)
1266         {
1267             view->hex_cursor += view->bytes_per_line;
1268-            if (lines != 1)
1269+            if (lines != 1) {
1270                 view->dpy_start += view->bytes_per_line;
1271+                view->dpy_paragraph_skip_lines = 0;
1272+                view->dpy_wrap_dirty = TRUE;
1273+           }
1274         }
1275-    }
1276-    else
1277-    {
1278-        off_t new_offset = 0;
1279-
1280-        if (view->dpy_end - view->dpy_start > last_byte - view->dpy_end)
1281-        {
1282-            while (lines-- > 0)
1283-            {
1284-                if (view->text_wrap_mode)
1285-                    view->dpy_end =
1286-                        mcview_eol (view, view->dpy_end,
1287-                                    view->dpy_end + (off_t) view->data_area.width);
1288-                else
1289-                    view->dpy_end = mcview_eol (view, view->dpy_end, last_byte);
1290-
1291-                if (view->text_wrap_mode)
1292-                    new_offset =
1293-                        mcview_eol (view, view->dpy_start,
1294-                                    view->dpy_start + (off_t) view->data_area.width);
1295-                else
1296-                    new_offset = mcview_eol (view, view->dpy_start, last_byte);
1297-                if (new_offset < last_byte)
1298-                    view->dpy_start = new_offset;
1299-                if (view->dpy_end >= last_byte)
1300-                    break;
1301-            }
1302-        }
1303-        else
1304-        {
1305-            off_t i;
1306-            for (i = 0; i < lines && new_offset < last_byte; i++)
1307-            {
1308-                if (view->text_wrap_mode)
1309-                    new_offset =
1310-                        mcview_eol (view, view->dpy_start,
1311-                                    view->dpy_start + (off_t) view->data_area.width);
1312-                else
1313-                    new_offset = mcview_eol (view, view->dpy_start, last_byte);
1314-                if (new_offset < last_byte)
1315-                    view->dpy_start = new_offset;
1316-            }
1317-        }
1318+    } else {
1319+        mcview_ascii_move_down (view, lines);
1320     }
1321     mcview_movement_fixups (view, TRUE);
1322 }
1323@@ -257,7 +181,7 @@ mcview_move_left (mcview_t * view, off_t columns)
1324             if (old_cursor > 0 || view->hexedit_lownibble)
1325                 view->hexedit_lownibble = !view->hexedit_lownibble;
1326     }
1327-    else
1328+    else if (!view->text_wrap_mode)
1329     {
1330         if (view->dpy_text_column >= columns)
1331             view->dpy_text_column -= columns;
1332@@ -289,7 +213,7 @@ mcview_move_right (mcview_t * view, off_t columns)
1333             if (old_cursor < last_byte || !view->hexedit_lownibble)
1334                 view->hexedit_lownibble = !view->hexedit_lownibble;
1335     }
1336-    else
1337+    else if (!view->text_wrap_mode)
1338     {
1339         view->dpy_text_column += columns;
1340     }
1341@@ -302,6 +226,8 @@ void
1342 mcview_moveto_top (mcview_t * view)
1343 {
1344     view->dpy_start = 0;
1345+    view->dpy_paragraph_skip_lines = 0;
1346+    mcview_state_machine_init(&view->dpy_state_top, 0);
1347     view->hex_cursor = 0;
1348     view->dpy_text_column = 0;
1349     mcview_movement_fixups (view, TRUE);
1350@@ -331,6 +257,8 @@ mcview_moveto_bottom (mcview_t * view)
1351         const off_t datalines = view->data_area.height;
1352 
1353         view->dpy_start = filesize;
1354+        view->dpy_paragraph_skip_lines = 0;
1355+        view->dpy_wrap_dirty = TRUE;
1356         mcview_move_up (view, datalines);
1357     }
1358 }
1359@@ -347,6 +275,8 @@ mcview_moveto_bol (mcview_t * view)
1360     else if (!view->text_wrap_mode)
1361     {
1362         view->dpy_start = mcview_bol (view, view->dpy_start, 0);
1363+        view->dpy_paragraph_skip_lines = 0;
1364+        view->dpy_wrap_dirty = TRUE;
1365     }
1366     view->dpy_text_column = 0;
1367     mcview_movement_fixups (view, TRUE);
1368@@ -424,10 +354,14 @@ mcview_moveto_offset (mcview_t * view, off_t offset)
1369     {
1370         view->hex_cursor = offset;
1371         view->dpy_start = offset - offset % view->bytes_per_line;
1372+        view->dpy_paragraph_skip_lines = 0;
1373+        view->dpy_wrap_dirty = TRUE;
1374     }
1375     else
1376     {
1377         view->dpy_start = offset;
1378+        view->dpy_paragraph_skip_lines = 0;
1379+        view->dpy_wrap_dirty = TRUE;
1380     }
1381     mcview_movement_fixups (view, TRUE);
1382 }
1383@@ -498,9 +432,14 @@ mcview_moveto_match (mcview_t * view)
1384         view->hexedit_lownibble = FALSE;
1385         view->dpy_start = view->search_start - view->search_start % view->bytes_per_line;
1386         view->dpy_end = view->search_end - view->search_end % view->bytes_per_line;
1387+        view->dpy_paragraph_skip_lines = 0;
1388+        view->dpy_wrap_dirty = TRUE;
1389     }
1390-    else
1391+    else {
1392         view->dpy_start = mcview_bol (view, view->search_start, 0);
1393+        view->dpy_paragraph_skip_lines = 0;
1394+        view->dpy_wrap_dirty = TRUE;
1395+    }
1396 
1397     mcview_scroll_to_cursor (view);
1398     view->dirty++;
1399diff --git a/src/viewer/nroff.c b/src/viewer/nroff.c
1400index eb3a486..7905eb9 100644
1401--- a/src/viewer/nroff.c
1402+++ b/src/viewer/nroff.c
1403@@ -1,6 +1,6 @@
1404 /*
1405    Internal file viewer for the Midnight Commander
1406-   Function for nroff-like view
1407+   Functions for searching in nroff-like view
1408 
1409    Copyright (C) 1994-2014
1410    Free Software Foundation, Inc.
1411@@ -91,6 +91,7 @@ mcview_nroff_get_char (mcview_nroff_t * nroff, int *ret_val, off_t nroff_index)
1412 /*** public functions ****************************************************************************/
1413 /* --------------------------------------------------------------------------------------------- */
1414 
1415+#if 0  /* moved to ascii.c */
1416 void
1417 mcview_display_nroff (mcview_t * view)
1418 {
1419@@ -249,6 +250,7 @@ mcview_display_nroff (mcview_t * view)
1420     }
1421     view->dpy_end = from;
1422 }
1423+#endif
1424 
1425 /* --------------------------------------------------------------------------------------------- */
1426 
1427diff --git a/src/viewer/plain.c b/src/viewer/plain.c
1428deleted file mode 100644
1429index 8003f3a..0000000
1430--- a/src/viewer/plain.c
1431+++ /dev/null
1432@@ -1,207 +0,0 @@
1433-/*
1434-   Internal file viewer for the Midnight Commander
1435-   Function for plain view
1436-
1437-   Copyright (C) 1994-2014
1438-   Free Software Foundation, Inc.
1439-
1440-   Written by:
1441-   Miguel de Icaza, 1994, 1995, 1998
1442-   Janne Kukonlehto, 1994, 1995
1443-   Jakub Jelinek, 1995
1444-   Joseph M. Hinkle, 1996
1445-   Norbert Warmuth, 1997
1446-   Pavel Machek, 1998
1447-   Roland Illig <roland.illig@gmx.de>, 2004, 2005
1448-   Slava Zanko <slavazanko@google.com>, 2009
1449-   Andrew Borodin <aborodin@vmail.ru>, 2009-2014
1450-   Ilia Maslakov <il.smind@gmail.com>, 2009
1451-
1452-   This file is part of the Midnight Commander.
1453-
1454-   The Midnight Commander is free software: you can redistribute it
1455-   and/or modify it under the terms of the GNU General Public License as
1456-   published by the Free Software Foundation, either version 3 of the License,
1457-   or (at your option) any later version.
1458-
1459-   The Midnight Commander is distributed in the hope that it will be useful,
1460-   but WITHOUT ANY WARRANTY; without even the implied warranty of
1461-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
1462-   GNU General Public License for more details.
1463-
1464-   You should have received a copy of the GNU General Public License
1465-   along with this program.  If not, see <http://www.gnu.org/licenses/>.
1466- */
1467-
1468-#include <config.h>
1469-
1470-#include "lib/global.h"
1471-#include "lib/tty/tty.h"
1472-#include "lib/skin.h"
1473-#include "lib/util.h"           /* is_printable() */
1474-#ifdef HAVE_CHARSET
1475-#include "lib/charsets.h"
1476-#endif
1477-
1478-#include "src/setup.h"          /* option_tab_spacing */
1479-
1480-#include "internal.h"
1481-
1482-/*** global variables ****************************************************************************/
1483-
1484-/*** file scope macro definitions ****************************************************************/
1485-
1486-/*** file scope type declarations ****************************************************************/
1487-
1488-/*** file scope variables ************************************************************************/
1489-
1490-/*** file scope functions ************************************************************************/
1491-/* --------------------------------------------------------------------------------------------- */
1492-
1493-/* --------------------------------------------------------------------------------------------- */
1494-/*** public functions ****************************************************************************/
1495-/* --------------------------------------------------------------------------------------------- */
1496-
1497-void
1498-mcview_display_text (mcview_t * view)
1499-{
1500-    const screen_dimen left = view->data_area.left;
1501-    const screen_dimen top = view->data_area.top;
1502-    const screen_dimen width = view->data_area.width;
1503-    const screen_dimen height = view->data_area.height;
1504-    screen_dimen row = 0, col = 0;
1505-    off_t from;
1506-    int cw = 1;
1507-    int c, prev_ch = 0;
1508-    gboolean last_row = TRUE;
1509-    struct hexedit_change_node *curr = view->change_list;
1510-
1511-    mcview_display_clean (view);
1512-    mcview_display_ruler (view);
1513-
1514-    /* Find the first displayable changed byte */
1515-    from = view->dpy_start;
1516-    while ((curr != NULL) && (curr->offset < from))
1517-        curr = curr->next;
1518-
1519-    while (row < height)
1520-    {
1521-#ifdef HAVE_CHARSET
1522-        if (view->utf8)
1523-        {
1524-            gboolean read_res = TRUE;
1525-
1526-            c = mcview_get_utf (view, from, &cw, &read_res);
1527-            if (!read_res)
1528-                break;
1529-        }
1530-        else
1531-#endif
1532-        if (!mcview_get_byte (view, from, &c))
1533-            break;
1534-
1535-        last_row = FALSE;
1536-        from++;
1537-        if (cw > 1)
1538-            from += cw - 1;
1539-
1540-        if (c != '\n' && prev_ch == '\r')
1541-        {
1542-            if (++row >= height)
1543-                break;
1544-
1545-            col = 0;
1546-            /* tty_print_anychar ('\n'); */
1547-        }
1548-
1549-        prev_ch = c;
1550-        if (c == '\r')
1551-            continue;
1552-
1553-        if (c == '\n')
1554-        {
1555-            col = 0;
1556-            row++;
1557-            continue;
1558-        }
1559-
1560-        if (col >= width && view->text_wrap_mode)
1561-        {
1562-            col = 0;
1563-            if (++row >= height)
1564-                break;
1565-        }
1566-
1567-        if (c == '\t')
1568-        {
1569-            col += (option_tab_spacing - col % option_tab_spacing);
1570-            if (view->text_wrap_mode && col >= width && width != 0)
1571-            {
1572-                row += col / width;
1573-                col %= width;
1574-            }
1575-            continue;
1576-        }
1577-
1578-        if (view->search_start <= from && from < view->search_end)
1579-            tty_setcolor (SELECTED_COLOR);
1580-        else
1581-            tty_setcolor (VIEW_NORMAL_COLOR);
1582-
1583-        if (((off_t) col >= view->dpy_text_column)
1584-            && ((off_t) col - view->dpy_text_column < (off_t) width))
1585-        {
1586-            widget_move (view, top + row, left + ((off_t) col - view->dpy_text_column));
1587-
1588-#ifdef HAVE_CHARSET
1589-            if (mc_global.utf8_display)
1590-            {
1591-                if (!view->utf8)
1592-                    c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
1593-                if (!g_unichar_isprint (c))
1594-                    c = '.';
1595-            }
1596-            else if (view->utf8)
1597-                c = convert_from_utf_to_current_c (c, view->converter);
1598-            else
1599-            {
1600-                c = convert_to_display_c (c);
1601-                if (!is_printable (c))
1602-                    c = '.';
1603-            }
1604-#else /* HAVE_CHARSET */
1605-            if (!is_printable (c))
1606-                c = '.';
1607-#endif /* HAVE_CHARSET */
1608-
1609-            tty_print_anychar (c);
1610-        }
1611-
1612-        col++;
1613-
1614-#ifdef HAVE_CHARSET
1615-        if (view->utf8)
1616-        {
1617-            if (g_unichar_iswide (c))
1618-                col++;
1619-            else if (g_unichar_iszerowidth (c))
1620-                col--;
1621-        }
1622-#endif
1623-    }
1624-
1625-    view->dpy_end = from;
1626-    if (mcview_show_eof != NULL && mcview_show_eof[0] != '\0')
1627-    {
1628-        if (last_row && mcview_get_byte (view, from - 1, &c) && c != '\n')
1629-            row--;
1630-
1631-        while (++row < height)
1632-        {
1633-            widget_move (view, top + row, left);
1634-            tty_print_string (mcview_show_eof);
1635-        }
1636-    }
1637-}
1638-
1639-/* --------------------------------------------------------------------------------------------- */
1640diff --git a/tests/src/viewer/viewertest.txt b/tests/src/viewer/viewertest.txt
1641new file mode 100644
1642index 0000000..6a0a299
1643--- /dev/null
1644+++ b/tests/src/viewer/viewertest.txt
1645@@ -0,0 +1,76 @@
1646+* LF as line terminator
1647+This row has 79 columns:    30        40        50        60        70       79
1648+This row has 80 columns:    30        40        50        60        70        80
1649+This row has 81 columns:    30        40        50        60        70         81
1650+
1651+* CR as line terminator
1652This row has 79 columns:    30        40        50        60        70       79
1653This row has 80 columns:    30        40        50        60        70        80
1654This row has 81 columns:    30        40        50        60        70         81
1655
1656* CR+LF as line terminator
1657+This row has 79 columns:    30        40        50        60        70       79
1658+This row has 80 columns:    30        40        50        60        70        80
1659+This row has 81 columns:    30        40        50        60        70         81
1660+
1661+* TAB characters of varying widths (with reference rendering above)
1662+88888888········7·······66······555·····4444····33333···222222··1111111·|
1663+88888888       7       66      555     4444    33333   222222  1111111 |
1664+
1665+* Combining accents on top of every second letter (a, c, ...)
1666+---------------------------------------------------|
1667+ÁBC̀DÉFG̀HÍJK̀LḾNÒPQ́RS̀TÚVẀXÝzỳxẃvùtśrq̀pónm̀lḱjìhǵfèdćbà|
1668+
1669+* More and more combining accents on a single character
1670+---  ---  ---  ---  ---  ---|
1671+0:a  1:à  2:à́  3:à́̂  4:à́̂̃  5:à́̂̃̄|
1672+0:x  1:x̀  2:x̀́  3:x̀́̂  4:x̀́̂̃  5:x̀́̂̃̄|
1673+
1674+* Combining accents at beginning of line, and after tab, with
1675+  reference rendering (explicit dotted circles, and spaces) above
1676+-       -       -       -       -|
1677+◌̀́       ◌̀́       ◌̀́       ◌̀́       ◌̀́|
1678+̀́     ̀́      ̀́      ̀́      ̀́|
1679+
1680+* Same with spacing mark
1681+一      一      一      一      一|
1682+◌ो      ◌ो      ◌ो      ◌ो      ◌ो|
1683+ो      ो       ो       ो       ो|
1684+
1685+* CJK, Lorem ipsum by Google Translate, second line shifted by a space.
1686+  When wrapped, the trailing bars will not align
1687+のイプサム嘆き、の痛みに座るが、時折状況が労苦と痛みが彼にいくつかの大きな喜びを調達することができる起こるので。 |
1688+ のイプサム嘆き、の痛みに座るが、時折状況が労苦と痛みが彼にいくつかの大きな喜びを調達することができる起こるので。|
1689+
1690+* Devanagari spacing marks, with reference positions. Just as with CJK,
1691+  the two cells should appear/disappear together
1692+一 一 一 一|一  一  一  一|一   一   一   一|一    一    一    一|
1693+हो हि हो हि|हो  हि  हो  हि|हो   हि   हो   हि|हो    हि    हो    हि|
1694+
1695+* Thai Sara Am
1696+-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --|
1697+aำ bำ cำ dำ eำ fำ gำ hำ iำ jำ kำ lำ mำ nำ oำ pำ qำ rำ sำ tำ uำ vำ wำ xำ yำ zำ|
1698+
1699+* TABs mixed with other weird characters
1700+-----   -       一      一一一一        --      -- --   |
1701+abcde          の      イプサム    à́b̀́  हो हि   |
1702+
1703+* Extreme stress test: base letter with multiple (c)ombining or (s)pacing marks
1704+---  -------  ----  ------------  -----------  -----------  -----------  -----------|
1705+c:x̀  ccccc:x̀́̂̃̄  s:xो  sssss:xोोोोो  sssccc:xोोो̀́̂  cccsss:x̀́̂ोोो  scscsc:xो̀ो́ो̂  cscscs:x̀ो́ो̂ो|
1706+
1707+* Same as above, but with CJK base char
1708+--一  ------一  --一-  ------一-----  -------一---  -------一---  -------一---  -------一---|
1709+c:の̀  ccccc:の̀́̂̃̄  s:のो  sssss:のोोोोो  sssccc:のोोो̀́̂  cccsss:の̀́̂ोोो  scscsc:のो̀ो́ो̂  cscscs:の̀ो́ो̂ो|
1710+
1711+* Nroff
1712+---------------  一一一一  - - - -  一 一 一 一  -----------------|
1713+_Hello,_World!_  のイプサ  à́ b̀́ c̀́ d̀́  हो हि हो हि  __b___u___b___u__|
1714+__HHeelllloo,,___W_o_r_l_d_!__  ののイイ_プ_サ  aà̀́́ bb̀̀́́ _c_̀_́ _d_̀_́  हहोो हहिि _ह_ो _ह_ि  ____bb_______u______bb_______u____|
1715+______ <- should be bold again
1716+
1717+* Invalid nroff (a backspace b tab backspace tab underscore backspace newline,
1718+  reference rendering in the first row)
1719+a.b     .       _.
1720+ab           _
1721+
1722+* Control characters (00-1F except tab/lf/cr, 7F, 80-9F), should all be replaced by dots
1723+@ABCDEFGH--KL-NOPQRSTUVWXYZ[\]^_|?|@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_|
1724+  ||€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ|
1725+
1726+* Invalid UTF-8 not tested here, use Markus Kuhn's stress test