Ticket #3250: mc-3250-viewer-rewrite-v2.patch

File mc-3250-viewer-rewrite-v2.patch, 71.1 KB (added by egmont, 10 years ago)
Reimplementation, v2

AUTHORS

From ae1ec2bf64096f7960ea2fc101f04a816178d69a Mon Sep 17 00:00:00 2001
From: Egmont Koblinger <egmont@gmail.com>
Date: Sat, 13 Sep 2014 22:33:46 +0200
Subject: [PATCH] 3250-v2

---
 AUTHORS                         |   1 +
 src/viewer/Makefile.am          |   2 +-
 src/viewer/actions_cmd.c        |   3 +
 src/viewer/ascii.c              | 991 ++++++++++++++++++++++++++++++++++++++++
 src/viewer/datasource.c         |   8 +-
 src/viewer/display.c            |   4 -
 src/viewer/internal.h           |  32 +-
 src/viewer/lib.c                |  13 +-
 src/viewer/mcviewer.c           |   7 +
 src/viewer/move.c               | 114 ++---
 src/viewer/nroff.c              | 158 +------
 src/viewer/plain.c              | 204 ---------
 tests/src/viewer/viewertest.txt | Bin 0 -> 4680 bytes
 13 files changed, 1073 insertions(+), 464 deletions(-)
 create mode 100644 src/viewer/ascii.c
 delete mode 100644 src/viewer/plain.c
 create mode 100644 tests/src/viewer/viewertest.txt

diff --git a/AUTHORS b/AUTHORS
index bb85c83..60ef7f7 100644

  Egmont Koblinger <egmont@gmail.com>
         Support of extended mouse clicks beyond 223 column
         Support of bracketed paste mode of xterm
                 (http://invisible-island.net/xterm/ctlseqs/ctlseqs.html#Bracketed%20Paste%20Mode)
+        Rewritten viewer
 Erwin van Eijk <wabbit@corner.iaf.nl>

src/viewer/Makefile.am

diff --git a/src/viewer/Makefile.am b/src/viewer/Makefile.am
index 53bc7a4..0602084 100644

  noinst_LTLIBRARIES = libmcviewer.la
 libmcviewer_la_SOURCES = \
         actions_cmd.c \
+        ascii.c \
         coord_cache.c \
         datasource.c \
         dialogs.c \
-…
+ libmcviewer_la_SOURCES = \
         mcviewer.h \
         move.c \
         nroff.c \
-        plain.c \
         search.c
 AM_CPPFLAGS = -I$(top_srcdir) $(GLIB_CFLAGS) $(PCRE_CPPFLAGS)

src/viewer/actions_cmd.c

diff --git a/src/viewer/actions_cmd.c b/src/viewer/actions_cmd.c
index 8df149e..6b69e66 100644

  mcview_execute_cmd (mcview_t * view, unsigned long command)
         break;
     case CK_Bookmark:
         view->dpy_start = view->marks[view->marker];
+        view->dpy_paragraph_skip_lines = 0;     /* TODO: remember this value in the marker? */
+        view->dpy_wrap_dirty = TRUE;
         view->dirty++;
         break;
 #ifdef HAVE_CHARSET
-…
+ mcview_adjust_size (WDialog * h)
     widget_set_size (WIDGET (view), 0, 0, LINES - 1, COLS);
     widget_set_size (WIDGET (b), LINES - 1, 0, 1, COLS);
+    view->dpy_wrap_dirty = TRUE;
     mcview_compute_areas (view);
     mcview_update_bytes_per_line (view);
+}

new file src/viewer/ascii.c

diff --git a/src/viewer/ascii.c b/src/viewer/ascii.c
new file mode 100644
index 0000000..57bf307

+/*
+   Internal file viewer for the Midnight Commander
+   Function for plain view
+   Copyright (C) 1994-2014
+   Free Software Foundation, Inc.
+   Written by:
+   Miguel de Icaza, 1994, 1995, 1998
+   Janne Kukonlehto, 1994, 1995
+   Jakub Jelinek, 1995
+   Joseph M. Hinkle, 1996
+   Norbert Warmuth, 1997
+   Pavel Machek, 1998
+   Roland Illig <roland.illig@gmx.de>, 2004, 2005
+   Slava Zanko <slavazanko@google.com>, 2009
+   Andrew Borodin <aborodin@vmail.ru>, 2009-2014
+   Ilia Maslakov <il.smind@gmail.com>, 2009
+   Rewritten almost from scratch by:
+   Egmont Koblinger <egmont@gmail.com>, 2014
+   This file is part of the Midnight Commander.
+   The Midnight Commander is free software: you can redistribute it
+   and/or modify it under the terms of the GNU General Public License as
+   published by the Free Software Foundation, either version 3 of the License,
+   or (at your option) any later version.
+   The Midnight Commander is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+   ------------------------------------------------------------------------------------------------
+   The viewer is implemented along the following design principles:
+   Goals: Always display simple scripts, double wide (CJK), combining accents and spacing marks
+   (often used e.g. in Devanagari) perfectly. Make the arrow keys always work correctly.
+   Absolutely non-goal: RTL.
+   Terminology:
+   - A "paragraph" is the text between two adjacent newline characters. A "line" or "row" is a
+   visual row on the screen. In wrap mode, the viewer formats a paragraph into one or more lines.
+   - The Unicode glossary <http://www.unicode.org/glossary/> doesn't seem to have a notion of "base
+   character followed by zero or more combining characters". The closest matches are "Combining
+   Character Sequence" meaning a base character followed by one or more combining characters, or
+   "Grapheme" which seems to exclude non-printable characters such as newline. In this file,
+   "combining character sequence" (or any obvious abbreviation thereof) means a base character
+   followed by zero or more (up to a current limit of 4) combining characters.
+   ------------------------------------------------------------------------------------------------
+   The parser-formatter is designed to be stateless across paragraphs. This is so that we can walk
+   backwards without having to reparse the whole file (we still need to reparse and reformat the
+   whole paragraph, but that is a lot better). This principle needs to be changed if we
+   ever get to address tickets 1849/2977, but then we can still store (for efficiency) the parser
+   state at the beginning of the paragraph, and safely walk backwards if we don't cross an escape
+   character.
+   The parser-formatter, however, definitely needs to carry a state across lines. Currently this
+   state contains:
+   - The logical column (as if we didn't wrap). This is used for handling TAB characters after a
+   wordwrap consistently with less.
+   - Whether the last nroff character was bold or underlined. This is used for displaying the
+   ambiguous _\b_ sequence consistently with less.
+   - Whether the desired way of displaying a lonely combining accent or spacing mark is to place it
+   over a dotted circle (we do this at the beginning of the paragraph or after a TAB), or to ignore
+   the combining char and show replacement char for the spacing mark (we do this if e.g. too many
+   of these were encountered and hence we don't glue them with their base character).
+   - (This state needs to be expanded if e.g. we decide to print verbose replacement characters
+   (e.g. "<U+0080>") and allow these to wrap around lines.)
+   The state also contains the file offset, as it doesn't make sense to ever know the state without
+   knowing the corresponding offset.
+   The state depends on various settings (viewer width, encoding, nroff mode, charwrap or wordwrap
+   mode (if we'll have that one day) etc.) and needs to be recomputed if any of these change.
+   Walking forwards is usually relatively easy both in the file and on the screen. Walking
+   backwards within a paragraph would only be possible in some special cases and even then it would
+   be painful, so we always walk back to the beginning of the paragraph and reparse-reformat from
+   there.
+   (Walking back within a line in the file would have at least the following difficulties: handling
+   the parser state; processing invalid UTF-8; processing invalid nroff (e.g. what is "_\bA\bA"?).
+   Walking back on the display: we wouldn't know where to display the last line of a paragraph, or
+   where to display a line if its following line starts with a wide (CJK or Tab) character. Long
+   story short: just forget this approach.)
+   Most important variables:
+   - dpy_start: Both in unwrap and wrap modes this points to the beginning of the topmost displayed
+   paragraph.
+   - dpy_text_column: Only in unwrap mode, an additional horizontal scroll.
+   - dpy_paragraph_skip_lines: Only in wrap mode, an additional vertical scroll (the number of
+   lines that are scrolled off at the top from the topmost paragraph).
+   - dpy_state_top: Only in wrap mode, the offset and parser-formatter state at the line where
+   displaying the file begins is cached here.
+   - dpy_wrap_dirty: If some parameter has changed that makes it necessary to reparse-redisplay the
+   topmost paragraph.
+   In wrap mode, the three variables "dpy_start", "dpy_paragraph_skip_lines" and "dpy_state_top"
+   are kept consistent. Think of the first two as the ones describing the position, and the third
+   as a cached value for better performance so that we don't need to wrap the invisible beginning
+   of the topmost paragraph over and over again. The third value needs to be recomputed each time a
+   parameter that influences parsing or displaying the file (e.g. width of screen, encoding, nroff
+   mode) changes; this is signaled by "dpy_wrap_dirty" to force recomputing "dpy_state_top" (and
+   clamp "dpy_paragraph_skip_lines" if necessary).
+   ------------------------------------------------------------------------------------------------
+   Help integration
+   I'm planning to port the help viewer to this codebase.
+   Splitting at sections would still happen in the help viewer. It would either copy a section, or
+   set force_max and a similar force_min to limit displaying to one section only.
+   Parsing the help format would go next to the nroff parser. The colors, alternate character set,
+   and emitting the version number would go to the "state". (The version number would be
+   implemented by emitting remaining characters of a buffer in the "state" one by one, without
+   advancing in the file position.)
+   The active link would be drawn similarly to the search highlight. Other than that, the viewer
+   wouldn't care about links (except for their color). help.c would keep track of which one is
+   highlighted, how to advance to the next/prev on an arrow, how the scroll offset needs to be
+   adjusted when moving, etc.
+   Add wrapping at word boundaries to where wrapping at char boundaries happens now.
+ */
+#include <config.h>
+#include "lib/global.h"
+#include "lib/tty/tty.h"
+#include "lib/skin.h"
+#include "lib/util.h"           /* is_printable() */
+#ifdef HAVE_CHARSET
+#include "lib/charsets.h"
+#endif
+#include "src/setup.h"          /* option_tab_spacing */
+#include "internal.h"
+/*** global variables ****************************************************************************/
+/*** file scope macro definitions ****************************************************************/
+/* The Unicode standard recommends that lonely combining characters are printed over a dotted
+ * circle. If the terminal is not UTF-8, this will be replaced by a dot anyway. */
+#define BASE_CHARACTER_FOR_LONELY_COMBINING 0x25CC      /* dotted circle */
+#define MAX_COMBINING_CHARS 4   /* both slang and ncurses support exactly 4 */
+/* I think anything other than space (e.g. arrows) just introduces visual clutter without
+ * actually adding value. */
+#define PARTIAL_CJK_AT_LEFT_MARGIN  ' '
+#define PARTIAL_CJK_AT_RIGHT_MARGIN ' '
+/*
+ * Wrap mode: This is for safety so that jumping to the end of file (which already includes
+ * scrolling back by a page) and then walking backwards is reasonably fast, even if the file is
+ * extremely large and consists of maybe full zeros or something like that. If there's no newline
+ * found within this limit, just start displaying from there and see what happens. We might get
+ * some displaying parameters (most importantly the columns) incorrect, but at least will show the
+ * file without spinning the CPU for ages. When scrolling back to that point, the user might see a
+ * garbled first line (even starting with an invalid partial UTF-8), but then walking back by yet
+ * another line should fix it.
+ *
+ * Unwrap mode: This is not used, we wouldn't be able to do anything reasonable without walking
+ * back a whole paragraph (well, view->data_area.height paragraphs actually).
+ */
+#define MAX_BACKWARDS_WALK_IN_PARAGRAPH (100 * 1000)
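
Illustration (not part of the patch): the bounded backward walk described in the comment above can be sketched as follows. This is a minimal sketch over an in-memory buffer. The name find_paragraph_start and its parameters are hypothetical, only '\n' is treated as a paragraph separator, and the patch's real code reads through the viewer's datasource layer and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: walk back from "offset" to the
 * start of the enclosing paragraph (the byte just after the previous '\n'),
 * giving up after "max_walk" bytes. */
size_t
find_paragraph_start (const char *buf, size_t offset, size_t max_walk)
{
    size_t walked = 0;

    assert (buf != NULL);
    while (offset > 0 && buf[offset - 1] != '\n' && walked < max_walk)
    {
        offset--;
        walked++;
    }
    return offset;
}
```

If the limit is hit first, the returned offset may lie in the middle of a paragraph, which is the "garbled first line" case the comment accepts in exchange for bounded work.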
+/*** file scope type declarations ****************************************************************/
+/*** file scope variables ************************************************************************/
+/*** file scope functions ************************************************************************/
+/* TODO: These methods shouldn't be necessary, see ticket 3257 */
+static int
+mcview_wcwidth (const mcview_t * view, int c)
+{
+#ifdef HAVE_CHARSET
+    if (view->utf8)
+    {
+        if (g_unichar_iswide (c))
+            return 2;
+        if (g_unichar_iszerowidth (c))
+            return 0;
+    }
+#endif /* HAVE_CHARSET */
+    return 1;
+}
+static gboolean
+mcview_ismark (const mcview_t * view, int c)
+{
+#ifdef HAVE_CHARSET
+    if (view->utf8)
+        return g_unichar_ismark (c);
+#endif /* HAVE_CHARSET */
+    return FALSE;
+}
+/* actually is_non_spacing_mark_or_enclosing_mark */
+static gboolean
+mcview_is_non_spacing_mark (const mcview_t * view, int c)
+{
+#ifdef HAVE_CHARSET
+    if (view->utf8)
+    {
+        GUnicodeType type = g_unichar_type (c);
+        return type == G_UNICODE_NON_SPACING_MARK || type == G_UNICODE_ENCLOSING_MARK;
+    }
+#endif /* HAVE_CHARSET */
+    return FALSE;
+}
+#if 0
+static gboolean
+mcview_is_spacing_mark (const mcview_t * view, int c)
+{
+#ifdef HAVE_CHARSET
+    if (view->utf8)
+    {
+        return g_unichar_type (c) == G_UNICODE_SPACING_MARK;
+    }
+#endif /* HAVE_CHARSET */
+    return FALSE;
+}
+#endif /* 0 */
+static gboolean
+mcview_isprint (const mcview_t * view, int c)
+{
+#ifdef HAVE_CHARSET
+    if (!view->utf8)
+        c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
+    return g_unichar_isprint (c);
+#endif /* HAVE_CHARSET */
+    /* TODO this is very-very buggy by design: ticket 3257 comments 0-1 */
+    return is_printable (c);
+}
+static int
+mcview_char_display (const mcview_t * view, int c, char *s)
+{
+#ifdef HAVE_CHARSET
+    if (mc_global.utf8_display)
+    {
+        if (!view->utf8)
+            c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
+        if (!g_unichar_isprint (c))
+            c = '.';
+        return g_unichar_to_utf8 (c, s);
+    }
+    else if (view->utf8)
+    {
+        if (g_unichar_iswide (c))
+        {
+            s[0] = s[1] = '.';
+            return 2;
+        }
+        if (g_unichar_iszerowidth (c))
+            return 0;
+        /* TODO the is_printable check below will be broken for this */
+        c = convert_from_utf_to_current_c (c, view->converter);
+    }
+    else
+    {
+        /* TODO the is_printable check below will be broken for this */
+        c = convert_to_display_c (c);
+    }
+#endif /* HAVE_CHARSET */
+    /* TODO this is very-very buggy by design: ticket 3257 comments 0-1 */
+    if (!is_printable (c))
+        c = '.';
+    *s = c;
+    return 1;
+}
+/* --------------------------------------------------------------------------------------------- */
+/*
+ * Just for convenience, a common interface in front of mcview_get_utf and mcview_get_byte, so that
+ * the caller doesn't have to care about utf8 vs 8-bit modes.
+ *
+ * Normally: stores c, updates state, returns TRUE.
+ * At EOF: state is unchanged, c is undefined, returns FALSE.
+ *
+ * Also, temporary hack: handle force_max here.
+ * TODO: move it to lower layers (datasource.c)?
+ */
+static gboolean
+mcview_get_next_char (mcview_t * view, mcview_state_machine_t * state, int *c)
+{
+    gboolean result;
+    int bytes_consumed;
+    /* Pretend EOF if we reached force_max */
+    if (view->force_max >= 0 && state->offset >= view->force_max)
+    {
+        return FALSE;
+    }
+#ifdef HAVE_CHARSET
+    if (view->utf8)
+    {
+        *c = mcview_get_utf (view, state->offset, &bytes_consumed, &result);
+        if (!result)
+            return FALSE;
+        /* Pretend EOF if we crossed force_max */
+        if (view->force_max >= 0 && state->offset + bytes_consumed > view->force_max)
+        {
+            return FALSE;
+        }
+        state->offset += bytes_consumed;
+        return TRUE;
+    }
+#endif /* HAVE_CHARSET */
+    if (!mcview_get_byte (view, state->offset, c))
+        return FALSE;
+    state->offset++;
+    return TRUE;
+}
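
Illustration (not part of the patch): mcview_get_next_char treats both reaching force_max and crossing it in the middle of a multi-byte character as EOF. A minimal sketch of that rule over an in-memory buffer, assuming a simplified UTF-8 length lookup with no validation. The names utf8_sequence_length and advance_one_char are hypothetical and are not functions from mc. Unlike mcview_get_utf, this sketch also reports EOF for a sequence truncated by the real end of the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: sequence length implied by a UTF-8 lead byte.
 * No validation; ASCII and stray continuation bytes count as one byte. */
int
utf8_sequence_length (unsigned char lead)
{
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;
}

/* Advance *offset past one character. A negative force_max means "no limit".
 * Returns false and leaves *offset unchanged at the end of the data, at
 * force_max, or when the character would cross either of them. */
bool
advance_one_char (const unsigned char *buf, long size, long force_max, long *offset)
{
    long limit = (force_max >= 0 && force_max < size) ? force_max : size;
    int len;

    assert (buf != NULL && *offset >= 0);
    if (*offset >= limit)
        return false;
    len = utf8_sequence_length (buf[*offset]);
    if (*offset + len > limit)
        return false;           /* pretend EOF rather than split the character */
    *offset += len;
    return true;
}
```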
+/*
+ * This function parses the next nroff character and gives it to you along with its desired color,
+ * so you never have to care about nroff again.
+ *
+ * The nroff mode does the backspace trick for every single character (Unicode codepoint). At least
+ * that's what the GNU groff 1.22 package produces, and that's what less 458 expects. For
+ * double-wide characters (CJK), still only a single backspace is emitted. For combining accents
+ * and such, the print-backspace-print step is repeated for the base character and then for each
+ * accent separately.
+ *
+ * So, the right place for this layer is after the bytes are interpreted in UTF-8, but before
+ * joining a base character with its combining accents.
+ *
+ * Normally: stores c and color, updates state, returns TRUE.
+ * At EOF: state is unchanged, c and color are undefined, returns FALSE.
+ *
+ * color can be null if the caller doesn't care.
+ */
+static gboolean
+mcview_get_next_maybe_nroff_char (mcview_t * view, mcview_state_machine_t * state, int *c,
+                                  int *color)
+{
+    mcview_state_machine_t state_after_nroff;
+    int c2, c3;
+    if (color != NULL)
+        *color = VIEW_NORMAL_COLOR;
+    if (!view->text_nroff_mode)
+        return mcview_get_next_char (view, state, c);
+    if (!mcview_get_next_char (view, state, c))
+        return FALSE;
+    /* Don't allow nroff formatting around CR, LF, TAB or other special chars */
+    if (!mcview_isprint (view, *c))
+        return TRUE;
+    state_after_nroff = *state;
+    if (!mcview_get_next_char (view, &state_after_nroff, &c2))
+        return TRUE;
+    if (c2 != '\b')
+        return TRUE;
+    if (!mcview_get_next_char (view, &state_after_nroff, &c3))
+        return TRUE;
+    if (!mcview_isprint (view, c3))
+        return TRUE;
+    if (*c == '_' && c3 == '_')
+    {
+        *state = state_after_nroff;
+        if (color != NULL)
+            *color =
+                state->nroff_underscore_is_underlined ? VIEW_UNDERLINED_COLOR : VIEW_BOLD_COLOR;
+        return TRUE;
+    }
+    else if (*c == c3)
+    {
+        *state = state_after_nroff;
+        state->nroff_underscore_is_underlined = FALSE;
+        if (color != NULL)
+            *color = VIEW_BOLD_COLOR;
+        return TRUE;
+    }
+    else if (*c == '_')
+    {
+        *c = c3;
+        *state = state_after_nroff;
+        state->nroff_underscore_is_underlined = TRUE;
+        if (color != NULL)
+            *color = VIEW_UNDERLINED_COLOR;
+        return TRUE;
+    }
+    else
+    {
+        return TRUE;
+    }
+}
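
Illustration (not part of the patch): the decision table that mcview_get_next_maybe_nroff_char applies to a "first, backspace, second" triple can be isolated as below. This is a sketch under the assumption that both characters are already known to be printable. The enum, the function name classify_overstrike and the out parameter are hypothetical. The flag models state->nroff_underscore_is_underlined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum overstrike_style { STYLE_NORMAL, STYLE_BOLD, STYLE_UNDERLINED };

/* Classify "first \b second". Stores the character to display in *out.
 * STYLE_NORMAL means "not an nroff sequence": only "first" is consumed. */
enum overstrike_style
classify_overstrike (int first, int second, bool *underscore_is_underlined, int *out)
{
    assert (underscore_is_underlined != NULL && out != NULL);
    *out = first;
    if (first == '_' && second == '_')
    {
        /* Ambiguous: follow whatever the previous nroff character was. */
        return *underscore_is_underlined ? STYLE_UNDERLINED : STYLE_BOLD;
    }
    if (first == second)
    {
        *underscore_is_underlined = false;
        return STYLE_BOLD;
    }
    if (first == '_')
    {
        *out = second;
        *underscore_is_underlined = true;
        return STYLE_UNDERLINED;
    }
    return STYLE_NORMAL;
}
```

The flag is the only piece of history the decision needs, which is why it is carried in the parser state across lines.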
+/*
+ * Get one base character, along with its combining or spacing mark characters.
+ *
+ * (A spacing mark is a character that extends the base character's width 1 into a combined
+ * character of width 2, yet these two character cells should not be separated. E.g. Devanagari
+ * <U+0939><U+094B>.)
+ *
+ * This method exists mainly for two reasons. One is to be able to tell if we fit on the current
+ * line or need to wrap to the next one. The other is that both slang and ncurses seem to require
+ * that the character and its combining marks are printed in a single call (or is it just a
+ * limitation of mc's wrapper to them?).
+ *
+ * For convenience, this method takes care of converting CR or CR+LF into LF.
+ * TODO this should probably happen later, when displaying the file?
+ *
+ * Normally: stores cs and color, updates state, returns >= 1 (entries in cs).
+ * At EOF: state is unchanged, cs and color are undefined, returns 0.
+ *
+ * @param view ...
+ * @param state the parser-formatter state machine's state, updated
+ * @param cs store the characters here
+ * @param clen the room available in cs (that is, at most clen-1 combining marks are allowed), must
+ *   be at least 2
+ * @param color if non-NULL, store the color here, taken from the first codepoint's color
+ * @return the number of entries placed in cs, or 0 on EOF
+ */
+static int
+mcview_next_combining_char_sequence (mcview_t * view, mcview_state_machine_t * state, int *cs,
+                                     int clen, int *color)
+{
+    int i = 1;
+    mcview_state_machine_t state_after_combining;
+    if (!mcview_get_next_maybe_nroff_char (view, state, cs, color))
+        return 0;
+    /* Process \r and \r\n newlines. */
+    if (cs[0] == '\r')
+    {
+        int cnext;
+        mcview_state_machine_t state_after_crlf = *state;
+        if (mcview_get_next_maybe_nroff_char (view, &state_after_crlf, &cnext, NULL)
+            && cnext == '\n')
+            *state = state_after_crlf;
+        cs[0] = '\n';
+        return 1;
+    }
+    /* We don't want combining over non-printable characters. This includes '\n' and '\t' too. */
+    if (!mcview_isprint (view, cs[0]))
+        return 1;
+    if (mcview_ismark (view, cs[0]))
+    {
+        if (!state->print_lonely_combining)
+        {
+            /* First character is combining. Either just return it, ... */
+            return 1;
+        }
+        else
+        {
+            /* or place this (and subsequent combining ones) over a dotted circle. */
+            cs[1] = cs[0];
+            cs[0] = BASE_CHARACTER_FOR_LONELY_COMBINING;
+            i = 2;
+        }
+    }
+    if (mcview_wcwidth (view, cs[0]) == 2)
+    {
+        /* Don't allow combining or spacing mark for wide characters, is this okay? */
+        return 1;
+    }
+    /* Look for more combining chars. Either at most clen-1 zero-width combining chars,
+     * or at most 1 spacing mark. Is this logic correct? */
+    for (; i < clen; i++)
+    {
+        state_after_combining = *state;
+        if (!mcview_get_next_maybe_nroff_char (view, &state_after_combining, &cs[i], NULL))
+            return i;
+        if (!mcview_ismark (view, cs[i]) || !mcview_isprint (view, cs[i]))
+            return i;
+        if (g_unichar_type (cs[i]) == G_UNICODE_SPACING_MARK)
+        {
+            /* Only allow as the first combining char. Stop processing in either case. */
+            if (i == 1)
+            {
+                *state = state_after_combining;
+                i++;
+            }
+            return i;
+        }
+        *state = state_after_combining;
+    }
+    return i;
+}
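
Illustration (not part of the patch): the grouping performed by mcview_next_combining_char_sequence, reduced to its core. This sketch works on an array of code points instead of the viewer state, ignores wide characters, spacing marks, nroff and newline handling, and assumes that only U+0300..U+036F are combining marks (the patch uses g_unichar_ismark). The names is_mark and next_sequence are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DOTTED_CIRCLE 0x25CC
#define MAX_COMBINING 4

/* Simplifying assumption: only the Combining Diacritical Marks block counts. */
bool
is_mark (int c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Copy one base character plus up to MAX_COMBINING marks from in[*pos..n)
 * into cs (room for 1 + MAX_COMBINING entries). Returns the number of
 * entries stored, or 0 at the end of the input. */
int
next_sequence (const int *in, size_t n, size_t *pos, int *cs, bool lonely_gets_circle)
{
    int i = 0;

    assert (in != NULL && cs != NULL && *pos <= n);
    if (*pos == n)
        return 0;
    if (is_mark (in[*pos]))
    {
        if (!lonely_gets_circle)
        {
            /* Lonely mark: hand it back alone and let the caller decide. */
            cs[0] = in[(*pos)++];
            return 1;
        }
        /* Lonely mark: place it (and its followers) over a dotted circle. */
        cs[i++] = DOTTED_CIRCLE;
    }
    else
        cs[i++] = in[(*pos)++];
    while (i < 1 + MAX_COMBINING && *pos < n && is_mark (in[*pos]))
        cs[i++] = in[(*pos)++];
    return i;
}
```

A fifth mark after the same base character is left in the input and comes back as a lonely mark on the next call, matching the "leftover after too many combining chars" case in the patch.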
+/*
+ * Parse, format and possibly display one visual line of text.
+ *
+ * Formatting starts at the given "state" (which encodes the file offset and parser and formatter's
+ * internal state). In unwrap mode, this should point to the beginning of the paragraph with the
+ * default state, the additional horizontal scrolling is added here. In wrap mode, this should
+ * point to the beginning of the line, with the proper state at that point.
+ *
+ * In wrap mode, if a line ends in a newline, it is consumed, even if it's exactly at the right
+ * edge. In unwrap mode, the whole remaining line, including the newline is consumed. Displaying
+ * the next line should start at "state"'s new value, or if we displayed the bottom line then
+ * state->offset tells the file offset to be shown in the top bar.
+ *
+ * If "row" is offscreen, don't actually display the line but still update "state" and return the
+ * proper value. This is used by mcview_wrap_move_down to advance in the file.
+ *
+ * @param view ...
+ * @param state the parser-formatter state machine's state, updated
+ * @param row print to this row
+ * @param paragraph_ended store TRUE if paragraph ended by newline or EOF, FALSE if wraps to next
+ *   line
+ * @return the number of rows, that is, 0 if we were already at EOF, otherwise 1
+ */
+static int
+mcview_display_line (mcview_t * view, mcview_state_machine_t * state, int row,
+                     gboolean * paragraph_ended)
+{
+    const screen_dimen left = view->data_area.left;
+    const screen_dimen top = view->data_area.top;
+    const screen_dimen width = view->data_area.width;
+    const screen_dimen height = view->data_area.height;
+    off_t dpy_text_column = view->text_wrap_mode ? 0 : view->dpy_text_column;
+    screen_dimen col = 0;
+    int color;
+    int cs[1 + MAX_COMBINING_CHARS];
+    int n;
+    char str[(1 + MAX_COMBINING_CHARS) * UTF8_CHAR_LEN + 1];
+    int charwidth;
+    int i, j;
+    mcview_state_machine_t state_saved;
+    if (paragraph_ended != NULL)
+        *paragraph_ended = TRUE;
+    if (!view->text_wrap_mode && col >= dpy_text_column + width)
+    {
+        /* Optimization: Fast forward to the end of the line, rather than carefully
+         * parsing and then not actually displaying it. */
+        off_t eol = mcview_eol (view, state->offset, mcview_get_filesize (view));
+        int retval = eol > state->offset ? 1 : 0;
+        mcview_state_machine_init (state, eol);
+        return retval;
+    }
+    while (1)
+    {
+        state_saved = *state;
+        n = mcview_next_combining_char_sequence (view, state, cs, 1 + MAX_COMBINING_CHARS, &color);
+        if (n == 0)
+            return col > 0 ? 1 : 0;
+        if (view->search_start <= state->offset && state->offset < view->search_end)
+            color = SELECTED_COLOR;
+        if (cs[0] == '\n')
+        {
+            /* New line: reset all formatting state for the next paragraph. */
+            mcview_state_machine_init (state, state->offset);
+            return 1;
+        }
+        if (mcview_is_non_spacing_mark (view, cs[0]))
+        {
+            /* Lonely combining character. Probably leftover after too many combining chars. Just ignore. */
+            continue;
+        }
+        /* Nonprintable, or lonely spacing mark */
+        if ((!mcview_isprint (view, cs[0]) || mcview_ismark (view, cs[0])) && cs[0] != '\t')
+            cs[0] = '.';
+        charwidth = 0;
+        for (i = 0; i < n; i++)
+        {
+            charwidth += mcview_wcwidth (view, cs[i]);
+        }
+        /* Adjust the width for TAB. It's handled below along with the normal characters,
+         * so that it's wrapped consistently with them, and is painted with the proper
+         * attributes (although currently it can't have a special color). */
+        if (cs[0] == '\t')
+        {
+            charwidth = option_tab_spacing - state->unwrapped_column % option_tab_spacing;
+            state->print_lonely_combining = TRUE;
+        }
+        else
+        {
+            state->print_lonely_combining = FALSE;
+        }
+        /* In wrap mode only: We're done with this row if the character sequence wouldn't fit.
+         * Except if at the first column, because then it wouldn't fit in the next row either.
+         * In this extreme case let the unwrapped code below do its best to display it. */
+        if (view->text_wrap_mode && (off_t) col + charwidth > dpy_text_column + width && col > 0)
+        {
+            *state = state_saved;
+            if (paragraph_ended != NULL)
+                *paragraph_ended = FALSE;
+            return 1;
+        }
+        /* Display, unless outside of the viewport. */
+        if (row >= 0 && row < (int) height)
+        {
+            if ((off_t) col >= dpy_text_column &&
+                (off_t) col + charwidth <= dpy_text_column + width)
+            {
+                /* The combining character sequence fits entirely in the viewport. Print it. */
+                tty_setcolor (color);
+                widget_move (view, top + row, left + ((off_t) col - dpy_text_column));
+                if (cs[0] == '\t')
+                {
+                    for (i = 0; i < charwidth; i++)
+                        tty_print_char (' ');
+                }
+                else
+                {
+                    j = 0;
+                    for (i = 0; i < n; i++)
+                    {
+                        j += mcview_char_display (view, cs[i], str + j);
+                    }
+                    str[j] = '\0';
+                    /* This is probably a bug in our tty layer, but tty_print_string
+                     * normalizes the string, whereas tty_printf doesn't. Don't normalize,
+                     * since we handle combining characters ourselves correctly, it's
+                     * better if they are copy-pasted correctly. Ticket 3255. */
+                    tty_printf ("%s", str);
+                }
+            }
+            else if ((off_t) col < dpy_text_column && (off_t) col + charwidth > dpy_text_column)
+            {
+                /* The combining character sequence would cross the left edge of the viewport.
+                 * This cannot happen with wrap mode. Print replacement character(s),
+                 * or spaces with the correct attributes for partial Tabs. */
+                tty_setcolor (color);
+                for (i = dpy_text_column;
+                     i < (off_t) col + charwidth && i < dpy_text_column + width; i++)
+                {
+                    widget_move (view, top + row, left + (i - dpy_text_column));
+                    tty_print_anychar (cs[0] == '\t' ? ' ' : PARTIAL_CJK_AT_LEFT_MARGIN);
+                }
+            }
+            else if ((off_t) col < dpy_text_column + width &&
+                     (off_t) col + charwidth > dpy_text_column + width)
+            {
+                /* The combining character sequence would cross the right edge of the viewport
+                 * and we're not wrapping. Print replacement character(s),
+                 * or spaces with the correct attributes for partial Tabs. */
+                tty_setcolor (color);
+                for (i = col; i < dpy_text_column + width; i++)
+                {
+                    widget_move (view, top + row, left + (i - dpy_text_column));
+                    tty_print_anychar (cs[0] == '\t' ? ' ' : PARTIAL_CJK_AT_RIGHT_MARGIN);
+                }
+            }
+        }
+        col += charwidth;
+        state->unwrapped_column += charwidth;
+        if (!view->text_wrap_mode && col >= dpy_text_column + width)
+        {
+            /* Optimization: Fast forward to the end of the line, rather than carefully
+             * parsing and then not actually displaying it. */
+            off_t eol = mcview_eol (view, state->offset, mcview_get_filesize (view));
+            mcview_state_machine_init (state, eol);
+            return 1;
+        }
+    }
+}
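
Illustration (not part of the patch): mcview_display_line sizes a TAB from state->unwrapped_column, not from the on-screen column. A minimal sketch of that arithmetic, where tab_width_at is a hypothetical helper and tab_spacing stands in for option_tab_spacing:

```c
#include <assert.h>

/* Hypothetical helper: number of cells a TAB occupies when it starts at the
 * given logical (unwrapped) column, with a tab stop every tab_spacing cells. */
int
tab_width_at (long unwrapped_column, int tab_spacing)
{
    assert (tab_spacing > 0 && unwrapped_column >= 0);
    return tab_spacing - (int) (unwrapped_column % tab_spacing);
}
```

For example, in a 10-column viewport with 8-cell tab stops, a TAB at logical column 13 sits at screen column 3 of the second row. Using the logical column gives tab_width_at (13, 8), which is 3 cells, whereas the screen column would give tab_width_at (3, 8), which is 5 cells.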
+/*
+ * Parse, format and possibly display one paragraph (perhaps not from the beginning).
+ *
+ * Formatting starts at the given "state" (which encodes the file offset and parser and formatter's
+ * internal state). In unwrap mode, this should point to the beginning of the paragraph with the
+ * default state, the additional horizontal scrolling is added here. In wrap mode, this may point
+ * to the beginning of the line within a paragraph (to display the partial paragraph at the top),
+ * with the proper state at that point.
+ *
+ * Displaying the next paragraph should start at "state"'s new value, or if we displayed the bottom
+ * line then state->offset tells the file offset to be shown in the top bar.
+ *
+ * If "row" is negative, don't display the first abs(row) lines and display the rest from the top.
+ * This was a nice idea but it's now unused :)
+ *
+ * If "row" is too large, don't display the paragraph at all but still return the number of lines.
+ * This is used when moving upwards.
+ *
+ * @param view ...
+ * @param state the parser-formatter state machine's state, updated
+ * @param row print starting at this row
+ * @return the number of rows the paragraph is wrapped to, that is, 0 if we were already at EOF,
+ *   otherwise 1 in unwrap mode, >= 1 in wrap mode. We stop when reaching the bottom of the
+ *   viewport; any further lines the paragraph would occupy are not counted
+ */
+static int
+mcview_display_paragraph (mcview_t * view, mcview_state_machine_t * state, int row)
+{
+    const screen_dimen height = view->data_area.height;
+    int lines = 0;
+    gboolean paragraph_ended;
+    while (1)
+    {
+        lines += mcview_display_line (view, state, row, &paragraph_ended);
+        if (paragraph_ended)
+            return lines;
+        if (row < (int) height)
+        {
+            row++;
+            /* stop if bottom of screen reached */
+            if (row >= (int) height)
+                return lines;
+        }
+    }
+}
+/*
+ * Recompute dpy_state_top from dpy_start and dpy_paragraph_skip_lines. Clamp
+ * dpy_paragraph_skip_lines if necessary.
+ *
+ * This method should be called in wrap mode after changing one of the parsing or formatting
+ * properties (e.g. window width, encoding, nroff), or when switching to wrap mode from unwrap or
+ * hex.
+ *
+ * If we stayed within the same paragraph then try to keep the vertical offset within that
+ * paragraph as well. It might happen though that the paragraph became shorter than our desired
+ * vertical position, in that case move to its last row.
+ */
+static void
+mcview_wrap_fixup (mcview_t * view)
+{
+    mcview_state_machine_t state_prev;
+    gboolean paragraph_ended;
+    int lines = view->dpy_paragraph_skip_lines;
+    if (!view->dpy_wrap_dirty)
+        return;
+    view->dpy_wrap_dirty = FALSE;
+    view->dpy_paragraph_skip_lines = 0;
+    mcview_state_machine_init (&view->dpy_state_top, view->dpy_start);
+    while (lines--)
+    {
+        state_prev = view->dpy_state_top;
+        if (mcview_display_line (view, &view->dpy_state_top, -1, &paragraph_ended) == 0)
+            break;
+        if (paragraph_ended)
+        {
+            view->dpy_state_top = state_prev;
+            break;
+        }
+        view->dpy_paragraph_skip_lines++;
+    }
+}
+/* --------------------------------------------------------------------------------------------- */
+/*** public functions ****************************************************************************/
+/* --------------------------------------------------------------------------------------------- */
+/*
+ * In both wrap and unwrap modes, dpy_start points to the beginning of the paragraph.
+ *
+ * In unwrap mode, start displaying from this position, probably applying an additional horizontal
+ * scroll.
+ *
+ * In wrap mode, an additional dpy_paragraph_skip_lines lines are skipped from the top of this
+ * paragraph. dpy_state_top contains the position and parser-formatter state corresponding to the
+ * top left corner so we can just start rendering from here. Unless dpy_wrap_dirty is set in which
+ * case dpy_state_top is invalid and we need to recompute first.
+ */
+void
+mcview_display_text (mcview_t * view)
+{
+    const screen_dimen left = view->data_area.left;
+    const screen_dimen top = view->data_area.top;
+    const screen_dimen height = view->data_area.height;
+    int row;
+    int n;
+    mcview_state_machine_t state;
+    gboolean again;
+    do
+    {
+        again = FALSE;
+        mcview_display_clean (view);
+        mcview_display_ruler (view);
+        if (view->text_wrap_mode)
+        {
+            mcview_wrap_fixup (view);
+            state = view->dpy_state_top;
+        }
+        else
+        {
+            mcview_state_machine_init (&state, view->dpy_start);
+        }
+        row = 0;
+        while (row < (int) height)
+        {
+            n = mcview_display_paragraph (view, &state, row);
+            if (n == 0)
+            {
+                /* In the rare case that displaying didn't start at the beginning
+                 * of the file, yet there are some empty lines at the bottom,
+                 * scroll the file and display again. This happens when e.g. the
+                 * window is made bigger, or the file becomes shorter due to
+                 * charset change or enabling nroff. */
+                if ((view->text_wrap_mode ? view->dpy_state_top.offset : view->dpy_start) > 0)
+                {
+                    mcview_ascii_move_up (view, height - row);
+                    again = TRUE;
+                }
+                break;
+            }
+            row += n;
+        }
+    }
+    while (again);
+    view->dpy_end = state.offset;
+    view->dpy_state_bottom = state;
+    if (mcview_show_eof != NULL && mcview_show_eof[0] != '\0')
+    {
+        while (row < (int) height)
+        {
+            widget_move (view, top + row, left);
+            /* TODO: should make it no wider than the viewport */
+            tty_print_string (mcview_show_eof);
+            row++;
+        }
+    }
+}
+/*
+ * Move down.
+ *
+ * It's very simple. Just invisibly format the next "lines" lines, carefully carrying the formatter
+ * state in wrap mode. But before each step we need to check if we've already hit the end of the
+ * file, in that case we can no longer move. This is done by walking from dpy_state_bottom.
+ *
+ * Note that this relies on mcview_display_text() setting dpy_state_bottom to its correct value
+ * upon rendering the screen contents. So don't call this function from other functions (e.g. at
+ * the bottom of mcview_ascii_move_up()) which invalidate this value.
+ */
+void
+mcview_ascii_move_down (mcview_t * view, off_t lines)
+{
+    gboolean paragraph_ended;
+    while (lines--)
+    {
+        /* See if there's still data below the bottom line, by imaginarily displaying one
+         * more line. This takes care of reading more data into growbuf, if required.
+         * If the end position didn't advance, we're at EOF and bail out. */
+        if (mcview_display_line (view, &view->dpy_state_bottom, -1, &paragraph_ended) == 0)
+            break;
+        /* Okay, there's enough data. Move by 1 row at the top, too. No need to check for
+         * EOF, that can't happen. */
+        if (!view->text_wrap_mode)
+        {
+            view->dpy_start = mcview_eol (view, view->dpy_start, mcview_get_filesize (view));
+            view->dpy_paragraph_skip_lines = 0;
+            view->dpy_wrap_dirty = TRUE;
+        }
+        else
+        {
+            mcview_display_line (view, &view->dpy_state_top, -1, &paragraph_ended);
+            if (paragraph_ended)
+            {
+                view->dpy_start = view->dpy_state_top.offset;
+                view->dpy_paragraph_skip_lines = 0;
+            }
+            else
+            {
+                view->dpy_paragraph_skip_lines++;
+            }
+        }
+    }
+}
+/*
+ * Move up.
+ *
+ * Unwrap mode: Piece of cake. Wrap mode: If we'd walk back more than the current line offset
+ * within the paragraph, we need to jump back to the previous paragraph and compute its height to
+ * see if we start from that paragraph, and repeat this if necessary. Once we're within the desired
+ * paragraph, we still need to format it from its beginning to know the state.
+ *
+ * See the top of this file for comments about MAX_BACKWARDS_WALK_IN_PARAGRAPH.
+ *
+ * force_max is a nice protection against the rare extreme case that the file underneath us
+ * changes; we don't want to endlessly consume a file possibly full of zeros when moving upwards.
+ */
+void
+mcview_ascii_move_up (mcview_t * view, off_t lines)
+{
+    int i;
+    if (!view->text_wrap_mode)
+    {
+        while (lines--)
+            view->dpy_start = mcview_bol (view, view->dpy_start - 1, 0);
+        view->dpy_paragraph_skip_lines = 0;
+        view->dpy_wrap_dirty = TRUE;
+    }
+    else
+    {
+        while (lines > view->dpy_paragraph_skip_lines)
+        {
+            /* We need to go back to the previous paragraph. */
+            if (view->dpy_start == 0)
+            {
+                /* Oops, we're already in the first paragraph. */
+                view->dpy_paragraph_skip_lines = 0;
+                mcview_state_machine_init (&view->dpy_state_top, 0);
+                return;
+            }
+            lines -= view->dpy_paragraph_skip_lines;
+            view->force_max = view->dpy_start;
+            view->dpy_start =
+                mcview_bol (view, view->dpy_start - 1,
+                            view->dpy_start - MAX_BACKWARDS_WALK_IN_PARAGRAPH);
+            mcview_state_machine_init (&view->dpy_state_top, view->dpy_start);
+            /* This is a tricky way of denoting that we're at the end of the paragraph.
+             * Normally we'd jump to the next paragraph and reset paragraph_skip_lines. But for
+             * walking backwards this is exactly what we need. */
+            view->dpy_paragraph_skip_lines =
+                mcview_display_paragraph (view, &view->dpy_state_top, view->data_area.height);
+            view->force_max = -1;
+        }
+        /* Okay, we have dpy_start pointing to the desired paragraph, and we still need to
+         * walk back "lines" lines from the current "dpy_paragraph_skip_lines" offset. We can't do
+         * that, so walk from the beginning of the paragraph. */
+        mcview_state_machine_init (&view->dpy_state_top, view->dpy_start);
+        view->dpy_paragraph_skip_lines -= lines;
+        for (i = 0; i < view->dpy_paragraph_skip_lines; i++)
+            mcview_display_line (view, &view->dpy_state_top, -1, NULL);
+    }
+}
+/* --------------------------------------------------------------------------------------------- */
+void
+mcview_state_machine_init (mcview_state_machine_t * state, off_t offset)
+{
+    memset (state, 0, sizeof (*state));
+    state->offset = offset;
+    state->print_lonely_combining = TRUE;
+}
+/* --------------------------------------------------------------------------------------------- */
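
The wrap-mode bookkeeping described in the comments of mcview_wrap_fixup() and mcview_ascii_move_up() above can be tried out in isolation. The following is only a sketch, not part of the patch: it assumes the file is reduced to an array of paragraph heights (rows each paragraph wraps to), and wrap_pos_t, clamp_skip_lines() and model_move_up() are hypothetical names standing in for dpy_start, dpy_paragraph_skip_lines and the two functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: the file is an array of paragraph heights, i.e. the
 * number of screen rows each paragraph wraps to (always >= 1). */
typedef struct
{
    const int *heights;         /* rows per paragraph */
    size_t paragraph;           /* paragraph at the top (stands in for dpy_start) */
    int skip_lines;             /* rows skipped inside it (dpy_paragraph_skip_lines) */
} wrap_pos_t;

/* Models the clamping done by mcview_wrap_fixup(): keep the desired row if the
 * paragraph is still tall enough, otherwise fall back to its last row. */
static int
clamp_skip_lines (int desired, int paragraph_rows)
{
    assert (desired >= 0 && paragraph_rows >= 0);
    if (paragraph_rows == 0)    /* nothing to display, e.g. at EOF */
        return 0;
    return desired < paragraph_rows ? desired : paragraph_rows - 1;
}

/* Models the wrap-mode branch of mcview_ascii_move_up(). */
static void
model_move_up (wrap_pos_t * pos, int lines)
{
    while (lines > pos->skip_lines)
    {
        if (pos->paragraph == 0)
        {
            /* Already in the first paragraph: stop at the top of the file. */
            pos->skip_lines = 0;
            return;
        }
        lines -= pos->skip_lines;
        pos->paragraph--;
        /* "One past the last row" of the previous paragraph. */
        pos->skip_lines = pos->heights[pos->paragraph];
    }
    pos->skip_lines -= lines;
}
```

With heights { 3, 1, 4 } and the top of the screen at row 1 of the third paragraph, moving up by 3 rows ends at row 2 of the first paragraph. The real code additionally has to re-run the formatter from the paragraph start to rebuild dpy_state_top, which this model leaves out.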

src/viewer/datasource.c

diff --git a/src/viewer/datasource.c b/src/viewer/datasource.c
index 3389ee4..d6da436 100644

@@ -164,7 +164,7 @@ mcview_get_ptr_string (mcview_t * view, off_t byte_index)
 /* --------------------------------------------------------------------------------------------- */
 
 int
-mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * result)
+mcview_get_utf (mcview_t * view, off_t byte_index, int *bytes_consumed, gboolean * result)
 {
     gchar *str = NULL;
     int res = -1;
@@ -172,7 +172,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
     gchar *next_ch = NULL;
     gchar utf8buf[UTF8_CHAR_LEN + 1];
 
-    *char_width = 0;
+    *bytes_consumed = 0;
     *result = FALSE;
 
     switch (view->datasource)
@@ -218,7 +218,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
     if (res < 0)
     {
         ch = *str;
-        *char_width = 1;
+        *bytes_consumed = 1;
     }
     else
     {
@@ -226,7 +226,7 @@ mcview_get_utf (mcview_t * view, off_t byte_index, int *char_width, gboolean * r
         /* Calculate UTF-8 char width */
         next_ch = g_utf8_next_char (str);
         if (next_ch)
-            *char_width = next_ch - str;
+            *bytes_consumed = next_ch - str;
         else
             return 0;
     }
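
The rename from char_width to bytes_consumed suggests that the value is a byte count used to advance the file offset, not a display width in columns. A minimal sketch of that distinction follows; utf8_bytes_consumed() and skip_chars() are hypothetical helpers written for this note (they are not mc or GLib functions), and the lead-byte test is a simplification that does not validate continuation bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of mc: number of bytes a UTF-8 sequence
 * occupies, judged from its lead byte only (no validation of the rest). */
static int
utf8_bytes_consumed (unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: step over one byte, as the res < 0 branch does */
}

/* Advance a byte offset over up to n characters, never past len. */
static size_t
skip_chars (const unsigned char *buf, size_t len, size_t offset, size_t n)
{
    assert (buf != NULL);
    while (n-- > 0 && offset < len)
        offset += (size_t) utf8_bytes_consumed (buf[offset]);
    return offset < len ? offset : len;
}
```

For example, U+4E2D (a CJK ideograph) occupies three bytes in the file but two columns on screen, so a single "width" variable cannot describe both; the offset must advance by the byte count while the column advances by the display width.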

src/viewer/display.c

diff --git a/src/viewer/display.c b/src/viewer/display.c
index 00c6ec0..b1bd390 100644

@@ -251,10 +251,6 @@ mcview_display (mcview_t * view)
     {
         mcview_display_hex (view);
     }
-    else if (view->text_nroff_mode)
-    {
-        mcview_display_nroff (view);
-    }
     else
     {
         mcview_display_text (view);

src/viewer/internal.h

diff --git a/src/viewer/internal.h b/src/viewer/internal.h
index 9562c52..fc665db 100644

  typedef struct
     coord_cache_entry_t **cache;
 } coord_cache_t;
+/* TODO: find a better name. This is not actually a "state machine",
+ * but a "state machine's state", but that sounds silly.
+ * Could be parser_state, formatter_state... */
+typedef struct
+{
+    off_t offset;               /* The file offset at which this is the state. */
+    off_t unwrapped_column;     /* Columns if the paragraph wasn't wrapped, */
+    /* used for positioning TABs in wrapped lines */
+    gboolean nroff_underscore_is_underlined;    /* whether _\b_ is underlined rather than bold */
+    gboolean print_lonely_combining;    /* whether lonely combining marks are printed on a dotted circle */
+} mcview_state_machine_t;
 struct mcview_nroff_struct;
 struct mcview_struct
-…
+ struct mcview_struct
     /* Display information */
     gboolean active;            /* Active or not in QuickView mode */
     screen_dimen dpy_frame_size;        /* Size of the frame surrounding the real viewer */
-    off_t dpy_start;            /* Offset of the displayed data */
+    off_t dpy_start;            /* Offset of the displayed data (start of the paragraph in non-hex mode) */
     off_t dpy_end;              /* Offset after the displayed data */
+    off_t dpy_paragraph_skip_lines;     /* Extra lines to skip in wrap mode */
+    mcview_state_machine_t dpy_state_top;       /* Parser-formatter state at the topmost visible line in wrap mode */
+    mcview_state_machine_t dpy_state_bottom;    /* Parser-formatter state after the bottom visible line in wrap mode */
+    gboolean dpy_wrap_dirty;    /* dpy_state_top needs to be recomputed */
     off_t dpy_text_column;      /* Number of skipped columns in non-wrap
                                  * text mode */
     off_t hex_cursor;           /* Hexview cursor position in file */
-…
+ struct mcview_struct
     struct area ruler_area;     /* Where the ruler is displayed */
     struct area data_area;      /* Where the data is displayed */
+    ssize_t force_max;          /* Force a max offset, or -1 */
     int dirty;                  /* Number of skipped updates */
     gboolean dpy_bbar_dirty;    /* Does the button bar need to be updated? */
-…
+ cb_ret_t mcview_callback (Widget * w, Widget * sender, widget_msg_t msg, int par
 cb_ret_t mcview_dialog_callback (Widget * w, Widget * sender, widget_msg_t msg, int parm,
                                  void *data);
+/* ascii.c: */
+void mcview_display_text (mcview_t *);
+void mcview_state_machine_init (mcview_state_machine_t *, off_t);
+void mcview_ascii_move_down (mcview_t *, off_t);
+void mcview_ascii_move_up (mcview_t *, off_t);
 /* coord_cache.c: */
 coord_cache_t *coord_cache_new (void);
 void coord_cache_free (coord_cache_t * cache);
-…
+ void mcview_place_cursor (mcview_t *);
 void mcview_moveto_match (mcview_t *);
 /* nroff.c: */
-void mcview_display_nroff (mcview_t * view);
 int mcview__get_nroff_real_len (mcview_t * view, off_t, off_t p);
 mcview_nroff_t *mcview_nroff_seq_new_num (mcview_t * view, off_t p);
 mcview_nroff_t *mcview_nroff_seq_new (mcview_t * view);
 void mcview_nroff_seq_free (mcview_nroff_t **);
-…
+ nroff_type_t mcview_nroff_seq_info (mcview_nroff_t *);
 int mcview_nroff_seq_next (mcview_nroff_t *);
 int mcview_nroff_seq_prev (mcview_nroff_t *);
-/* plain.c: */
-void mcview_display_text (mcview_t *);
 /* search.c: */
 mc_search_cbret_t mcview_search_cmd_callback (const void *user_data, gsize char_offset,
                                               int *current_char);
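
The comment on unwrapped_column in mcview_state_machine_t says it is used for positioning TABs in wrapped lines. A minimal sketch of why the unwrapped column matters is given below; tab_advance() is a hypothetical helper and the formula is an assumption about how a tab stop would typically be derived, not a quotation from ascii.c.

```c
#include <assert.h>

/* Hypothetical helper: how many columns a TAB advances when the tab stop is
 * computed from the column the character would have in the unwrapped
 * paragraph, rather than from the screen column of the wrapped row. */
static int
tab_advance (long unwrapped_column, int tab_spacing)
{
    assert (unwrapped_column >= 0 && tab_spacing > 0);
    return tab_spacing - (int) (unwrapped_column % tab_spacing);
}
```

With a 50-column viewport and a tab spacing of 8, a TAB at unwrapped column 53 sits at screen column 3 of the second row. Computed from the unwrapped column it advances 3 columns (to 56); computed from the screen column it would advance 5, so the text after the TAB would shift whenever the window width changes.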

src/viewer/lib.c

diff --git a/src/viewer/lib.c b/src/viewer/lib.c
index 6d51206..a5ab76d 100644

@@ -106,9 +106,8 @@ mcview_toggle_magic_mode (mcview_t * view)
 void
 mcview_toggle_wrap_mode (mcview_t * view)
 {
-    if (view->text_wrap_mode)
-        view->dpy_start = mcview_bol (view, view->dpy_start, 0);
     view->text_wrap_mode = !view->text_wrap_mode;
+    view->dpy_wrap_dirty = TRUE;
     view->dpy_bbar_dirty = TRUE;
     view->dirty++;
 }
@@ -120,6 +119,7 @@ mcview_toggle_nroff_mode (mcview_t * view)
 {
     view->text_nroff_mode = !view->text_nroff_mode;
     mcview_altered_nroff_flag = 1;
+    view->dpy_wrap_dirty = TRUE;
     view->dpy_bbar_dirty = TRUE;
     view->dirty++;
 }
@@ -144,6 +144,8 @@ mcview_toggle_hex_mode (mcview_t * view)
         widget_want_cursor (WIDGET (view), FALSE);
     }
     mcview_altered_hex_mode = 1;
+    view->dpy_paragraph_skip_lines = 0;
+    view->dpy_wrap_dirty = TRUE;
     view->dpy_bbar_dirty = TRUE;
     view->dirty++;
 }
@@ -170,6 +172,10 @@ mcview_init (mcview_t * view)
     view->coord_cache = NULL;
 
     view->dpy_start = 0;
+    view->dpy_paragraph_skip_lines = 0;
+    mcview_state_machine_init (&view->dpy_state_top, 0);
+    view->dpy_wrap_dirty = FALSE;
+    view->force_max = -1;
     view->dpy_text_column = 0;
     view->dpy_end = 0;
     view->hex_cursor = 0;
@@ -282,6 +288,7 @@ mcview_set_codeset (mcview_t * view)
             view->converter = conv;
         }
         view->utf8 = (gboolean) str_isutf8 (cp_id);
+        view->dpy_wrap_dirty = TRUE;
     }
 #else
     (void) view;
@@ -339,7 +346,7 @@ mcview_bol (mcview_t * view, off_t current, off_t limit)
         if (c == '\r')
             current--;
     }
-    while (current > 0 && current >= limit)
+    while (current > 0 && current > limit)
     {
         if (!mcview_get_byte (view, current - 1, &c))
             break;

src/viewer/mcviewer.c

diff --git a/src/viewer/mcviewer.c b/src/viewer/mcviewer.c
index eb8ec73..dafbc21 100644

  mcview_load (mcview_t * view, const char *command, const char *file, int start_l
   finish:
     view->command = g_strdup (command);
     view->dpy_start = 0;
+    view->dpy_paragraph_skip_lines = 0;
+    mcview_state_machine_init (&view->dpy_state_top, 0);
+    view->dpy_wrap_dirty = FALSE;
+    view->force_max = -1;
     view->search_start = 0;
     view->search_end = 0;
     view->dpy_text_column = 0;
-…
+ mcview_load (mcview_t * view, const char *command, const char *file, int start_l
         else
             new_offset = min (new_offset, max_offset);
         if (!view->hex_mode)
+        {
             view->dpy_start = mcview_bol (view, new_offset, 0);
+            view->dpy_wrap_dirty = TRUE;
+        }
         else
+        {
             view->dpy_start = new_offset - new_offset % view->bytes_per_line;

src/viewer/move.c

diff --git a/src/viewer/move.c b/src/viewer/move.c
index 7cd852b..b7938ac 100644

@@ -83,6 +83,8 @@ mcview_scroll_to_cursor (mcview_t * view)
         if (cursor < topleft)
             topleft = mcview_offset_rounddown (cursor, bytes);
         view->dpy_start = topleft;
+        view->dpy_paragraph_skip_lines = 0;
+        view->dpy_wrap_dirty = TRUE;
     }
 }
 
@@ -107,8 +109,6 @@ mcview_movement_fixups (mcview_t * view, gboolean reset_search)
 void
 mcview_move_up (mcview_t * view, off_t lines)
 {
-    off_t new_offset;
-
     if (view->hex_mode)
     {
         off_t bytes = lines * view->bytes_per_line;
@@ -116,7 +116,11 @@ mcview_move_up (mcview_t * view, off_t lines)
         {
             view->hex_cursor -= bytes;
             if (view->hex_cursor < view->dpy_start)
+            {
                 view->dpy_start = mcview_offset_doz (view->dpy_start, bytes);
+                view->dpy_paragraph_skip_lines = 0;
+                view->dpy_wrap_dirty = TRUE;
+            }
         }
         else
         {
@@ -125,46 +129,7 @@ mcview_move_up (mcview_t * view, off_t lines)
     }
     else
     {
-        off_t i;
-
-        for (i = 0; i < lines; i++)
-        {
-            if (view->dpy_start == 0)
-                break;
-            if (view->text_wrap_mode)
-            {
-                new_offset = mcview_bol (view, view->dpy_start, view->dpy_start - (off_t) 1);
-                /* check if dpy_start == BOL or not (then new_offset = dpy_start - 1,
-                 * no need to check more) */
-                if (new_offset == view->dpy_start)
-                {
-                    size_t last_row_length;
-
-                    new_offset = mcview_bol (view, new_offset - 1, 0);
-                    last_row_length = (view->dpy_start - new_offset) % view->data_area.width;
-                    if (last_row_length != 0)
-                    {
-                        /* if dpy_start == BOL in wrapped mode, find BOL of previous line
-                         * and move down all but the last rows */
-                        new_offset = view->dpy_start - (off_t) last_row_length;
-                    }
-                }
-                else
-                {
-                    /* if dpy_start != BOL in wrapped mode, just move one row up;
-                     * no need to check if > 0 as there is at least exactly one wrap
-                     * between dpy_start and BOL */
-                    new_offset = view->dpy_start - (off_t) view->data_area.width;
-                }
-                view->dpy_start = new_offset;
-            }
-            else
-            {
-                /* if unwrapped -> current BOL equals dpy_start, just find BOL of previous line */
-                new_offset = view->dpy_start - 1;
-                view->dpy_start = mcview_bol (view, new_offset, 0);
-            }
-        }
+        mcview_ascii_move_up (view, lines);
     }
     mcview_movement_fixups (view, TRUE);
 }
@@ -188,51 +153,16 @@ mcview_move_down (mcview_t * view, off_t lines)
         {
             view->hex_cursor += view->bytes_per_line;
             if (lines != 1)
+            {
                 view->dpy_start += view->bytes_per_line;
+                view->dpy_paragraph_skip_lines = 0;
+                view->dpy_wrap_dirty = TRUE;
+            }
         }
     }
     else
     {
-        off_t new_offset = 0;
-
-        if (view->dpy_end - view->dpy_start > last_byte - view->dpy_end)
-        {
-            while (lines-- > 0)
-            {
-                if (view->text_wrap_mode)
-                    view->dpy_end =
-                        mcview_eol (view, view->dpy_end,
-                                    view->dpy_end + (off_t) view->data_area.width);
-                else
-                    view->dpy_end = mcview_eol (view, view->dpy_end, last_byte);
-
-                if (view->text_wrap_mode)
-                    new_offset =
-                        mcview_eol (view, view->dpy_start,
-                                    view->dpy_start + (off_t) view->data_area.width);
-                else
-                    new_offset = mcview_eol (view, view->dpy_start, last_byte);
-                if (new_offset < last_byte)
-                    view->dpy_start = new_offset;
-                if (view->dpy_end >= last_byte)
-                    break;
-            }
-        }
-        else
-        {
-            off_t i;
-            for (i = 0; i < lines && new_offset < last_byte; i++)
-            {
-                if (view->text_wrap_mode)
-                    new_offset =
-                        mcview_eol (view, view->dpy_start,
-                                    view->dpy_start + (off_t) view->data_area.width);
-                else
-                    new_offset = mcview_eol (view, view->dpy_start, last_byte);
-                if (new_offset < last_byte)
-                    view->dpy_start = new_offset;
-            }
-        }
+        mcview_ascii_move_down (view, lines);
     }
     mcview_movement_fixups (view, TRUE);
 }
@@ -257,7 +187,7 @@ mcview_move_left (mcview_t * view, off_t columns)
         if (old_cursor > 0 || view->hexedit_lownibble)
             view->hexedit_lownibble = !view->hexedit_lownibble;
     }
-    else
+    else if (!view->text_wrap_mode)
     {
         if (view->dpy_text_column >= columns)
             view->dpy_text_column -= columns;
@@ -289,7 +219,7 @@ mcview_move_right (mcview_t * view, off_t columns)
         if (old_cursor < last_byte || !view->hexedit_lownibble)
             view->hexedit_lownibble = !view->hexedit_lownibble;
     }
-    else
+    else if (!view->text_wrap_mode)
     {
         view->dpy_text_column += columns;
     }
@@ -302,6 +232,8 @@ void
 mcview_moveto_top (mcview_t * view)
 {
     view->dpy_start = 0;
+    view->dpy_paragraph_skip_lines = 0;
+    mcview_state_machine_init (&view->dpy_state_top, 0);
     view->hex_cursor = 0;
     view->dpy_text_column = 0;
     mcview_movement_fixups (view, TRUE);
@@ -331,6 +263,8 @@ mcview_moveto_bottom (mcview_t * view)
         const off_t datalines = view->data_area.height;
 
         view->dpy_start = filesize;
+        view->dpy_paragraph_skip_lines = 0;
+        view->dpy_wrap_dirty = TRUE;
         mcview_move_up (view, datalines);
     }
 }
@@ -347,6 +281,8 @@ mcview_moveto_bol (mcview_t * view)
     else if (!view->text_wrap_mode)
     {
         view->dpy_start = mcview_bol (view, view->dpy_start, 0);
+        view->dpy_paragraph_skip_lines = 0;
+        view->dpy_wrap_dirty = TRUE;
     }
     view->dpy_text_column = 0;
     mcview_movement_fixups (view, TRUE);
@@ -424,10 +360,14 @@ mcview_moveto_offset (mcview_t * view, off_t offset)
     {
         view->hex_cursor = offset;
         view->dpy_start = offset - offset % view->bytes_per_line;
+        view->dpy_paragraph_skip_lines = 0;
+        view->dpy_wrap_dirty = TRUE;
     }
     else
     {
         view->dpy_start = offset;
+        view->dpy_paragraph_skip_lines = 0;
+        view->dpy_wrap_dirty = TRUE;
     }
     mcview_movement_fixups (view, TRUE);
 }
@@ -498,9 +438,15 @@ mcview_moveto_match (mcview_t * view)
         view->hexedit_lownibble = FALSE;
         view->dpy_start = view->search_start - view->search_start % view->bytes_per_line;
         view->dpy_end = view->search_end - view->search_end % view->bytes_per_line;
+        view->dpy_paragraph_skip_lines = 0;
+        view->dpy_wrap_dirty = TRUE;
     }
     else
+    {
         view->dpy_start = mcview_bol (view, view->search_start, 0);
+        view->dpy_paragraph_skip_lines = 0;
+        view->dpy_wrap_dirty = TRUE;
+    }
 
     mcview_scroll_to_cursor (view);
     view->dirty++;

src/viewer/nroff.c

diff --git a/src/viewer/nroff.c b/src/viewer/nroff.c
index 6d6c97b..e1f5010 100644

 /*
    Internal file viewer for the Midnight Commander
-   Function for nroff-like view
+   Functions for searching in nroff-like view
    Copyright (C) 1994-2014
    Free Software Foundation, Inc.
-…
+ mcview_nroff_get_char (mcview_nroff_t * nroff, int *ret_val, off_t nroff_index)
 /*** public functions ****************************************************************************/
 /* --------------------------------------------------------------------------------------------- */
-void
-mcview_display_nroff (mcview_t * view)
-{
-    const screen_dimen left = view->data_area.left;
-    const screen_dimen top = view->data_area.top;
-    const screen_dimen width = view->data_area.width;
-    const screen_dimen height = view->data_area.height;
-    screen_dimen row, col;
-    off_t from;
-    int cw = 1;
-    int c;
-    int c_prev = 0;
-    int c_next = 0;
-    mcview_display_clean (view);
-    mcview_display_ruler (view);
-    /* Find the first displayable changed byte */
-    from = view->dpy_start;
-    tty_setcolor (VIEW_NORMAL_COLOR);
-    for (row = 0, col = 0; row < height;)
-    {
-#ifdef HAVE_CHARSET
-        if (view->utf8)
-        {
-            gboolean read_res = TRUE;
-            c = mcview_get_utf (view, from, &cw, &read_res);
-            if (!read_res)
-                break;
-        }
-        else
-#endif
-        {
-            if (!mcview_get_byte (view, from, &c))
-                break;
-        }
-        from++;
-        if (cw > 1)
-            from += cw - 1;
-        if (c == '\b')
-        {
-            if (from > 1)
-            {
-#ifdef HAVE_CHARSET
-                if (view->utf8)
-                {
-                    gboolean read_res;
-                    c_next = mcview_get_utf (view, from, &cw, &read_res);
-                }
-                else
-#endif
-                    mcview_get_byte (view, from, &c_next);
-            }
-            if (g_unichar_isprint (c_prev) && g_unichar_isprint (c_next)
-                && (c_prev == c_next || c_prev == '_' || (c_prev == '+' && c_next == 'o')))
-            {
-                if (col == 0)
-                {
-                    if (row == 0)
-                    {
-                        /* We're inside an nroff character sequence at the
-                         * beginning of the screen -- just skip the
-                         * backspace and continue with the next character. */
-                        continue;
-                    }
-                    row--;
-                    col = width;
-                }
-                col--;
-                if (c_prev == '_'
-                    && (c_next != '_' || mcview_count_backspaces (view, from + 1) == 1))
-                    tty_setcolor (VIEW_UNDERLINED_COLOR);
-                else
-                    tty_setcolor (VIEW_BOLD_COLOR);
-                continue;
-            }
-        }
-        if ((c == '\n') || (col >= width && view->text_wrap_mode))
-        {
-            col = 0;
-            row++;
-            if (c == '\n' || row >= height)
-                continue;
-        }
-        if (c == '\r')
-        {
-            mcview_get_byte_indexed (view, from, 1, &c);
-            if (c == '\r' || c == '\n')
-                continue;
-            col = 0;
-            row++;
-            continue;
-        }
-        if (c == '\t')
-        {
-            off_t line, column;
-            mcview_offset_to_coord (view, &line, &column, from);
-            col += (option_tab_spacing - col % option_tab_spacing);
-            if (view->text_wrap_mode && col >= width && width != 0)
-            {
-                row += col / width;
-                col %= width;
-            }
-            continue;
-        }
-        if (view->search_start <= from && from < view->search_end)
-        {
-            tty_setcolor (SELECTED_COLOR);
-        }
-        c_prev = c;
-        if ((off_t) col >= view->dpy_text_column
-            && (off_t) col - view->dpy_text_column < (off_t) width)
-        {
-            widget_move (view, top + row, left + ((off_t) col - view->dpy_text_column));
-#ifdef HAVE_CHARSET
-            if (mc_global.utf8_display)
-            {
-                if (!view->utf8)
-                {
-                    c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
-                }
-                if (!g_unichar_isprint (c))
-                    c = '.';
-            }
-            else if (view->utf8)
-                c = convert_from_utf_to_current_c (c, view->converter);
-            else
-                c = convert_to_display_c (c);
-#endif
-            tty_print_anychar (c);
-        }
-        col++;
-#ifdef HAVE_CHARSET
-        if (view->utf8)
-        {
-            if (g_unichar_iswide (c))
-                col++;
-            else if (g_unichar_iszerowidth (c))
-                col--;
-        }
-#endif
-        tty_setcolor (VIEW_NORMAL_COLOR);
-    }
-    view->dpy_end = from;
-}
-/* --------------------------------------------------------------------------------------------- */
 int
 mcview__get_nroff_real_len (mcview_t * view, off_t start, off_t length)
 {

deleted file src/viewer/plain.c

diff --git a/src/viewer/plain.c b/src/viewer/plain.c
deleted file mode 100644
index 11e65d4..0000000

-/*
-   Internal file viewer for the Midnight Commander
-   Function for plain view
-   Copyright (C) 1994-2014
-   Free Software Foundation, Inc.
-   Written by:
-   Miguel de Icaza, 1994, 1995, 1998
-   Janne Kukonlehto, 1994, 1995
-   Jakub Jelinek, 1995
-   Joseph M. Hinkle, 1996
-   Norbert Warmuth, 1997
-   Pavel Machek, 1998
-   Roland Illig <roland.illig@gmx.de>, 2004, 2005
-   Slava Zanko <slavazanko@google.com>, 2009
-   Andrew Borodin <aborodin@vmail.ru>, 2009-2014
-   Ilia Maslakov <il.smind@gmail.com>, 2009
-   This file is part of the Midnight Commander.
-   The Midnight Commander is free software: you can redistribute it
-   and/or modify it under the terms of the GNU General Public License as
-   published by the Free Software Foundation, either version 3 of the License,
-   or (at your option) any later version.
-   The Midnight Commander is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-#include <config.h>
-#include "lib/global.h"
-#include "lib/tty/tty.h"
-#include "lib/skin.h"
-#include "lib/util.h"           /* is_printable() */
-#ifdef HAVE_CHARSET
-#include "lib/charsets.h"
-#endif
-#include "src/setup.h"          /* option_tab_spacing */
-#include "internal.h"
-/*** global variables ****************************************************************************/
-/*** file scope macro definitions ****************************************************************/
-/*** file scope type declarations ****************************************************************/
-/*** file scope variables ************************************************************************/
-/*** file scope functions ************************************************************************/
-/* --------------------------------------------------------------------------------------------- */
-/* --------------------------------------------------------------------------------------------- */
-/*** public functions ****************************************************************************/
-/* --------------------------------------------------------------------------------------------- */
-void
-mcview_display_text (mcview_t * view)
-{
-    const screen_dimen left = view->data_area.left;
-    const screen_dimen top = view->data_area.top;
-    const screen_dimen width = view->data_area.width;
-    const screen_dimen height = view->data_area.height;
-    screen_dimen row = 0, col = 0;
-    off_t from;
-    int cw = 1;
-    int c, prev_ch = 0;
-    gboolean last_row = TRUE;
-    mcview_display_clean (view);
-    mcview_display_ruler (view);
-    /* Find the first displayable changed byte */
-    from = view->dpy_start;
-    while (row < height)
-    {
-#ifdef HAVE_CHARSET
-        if (view->utf8)
-        {
-            gboolean read_res = TRUE;
-            c = mcview_get_utf (view, from, &cw, &read_res);
-            if (!read_res)
-                break;
-        }
-        else
-#endif
-        if (!mcview_get_byte (view, from, &c))
-            break;
-        last_row = FALSE;
-        from++;
-        if (cw > 1)
-            from += cw - 1;
-        if (c != '\n' && prev_ch == '\r')
-        {
-            if (++row >= height)
-                break;
-            col = 0;
-            /* tty_print_anychar ('\n'); */
-        }
-        prev_ch = c;
-        if (c == '\r')
-            continue;
-        if (c == '\n')
-        {
-            col = 0;
-            row++;
-            continue;
-        }
-        if (col >= width && view->text_wrap_mode)
-        {
-            col = 0;
-            if (++row >= height)
-                break;
-        }
-        if (c == '\t')
-        {
-            col += (option_tab_spacing - col % option_tab_spacing);
-            if (view->text_wrap_mode && col >= width && width != 0)
-            {
-                row += col / width;
-                col %= width;
-            }
-            continue;
-        }
-        if (view->search_start <= from && from < view->search_end)
-            tty_setcolor (SELECTED_COLOR);
-        else
-            tty_setcolor (VIEW_NORMAL_COLOR);
-        if (((off_t) col >= view->dpy_text_column)
-            && ((off_t) col - view->dpy_text_column < (off_t) width))
-        {
-            widget_move (view, top + row, left + ((off_t) col - view->dpy_text_column));
-#ifdef HAVE_CHARSET
-            if (mc_global.utf8_display)
-            {
-                if (!view->utf8)
-                    c = convert_from_8bit_to_utf_c ((unsigned char) c, view->converter);
-                if (!g_unichar_isprint (c))
-                    c = '.';
-            }
-            else if (view->utf8)
-                c = convert_from_utf_to_current_c (c, view->converter);
-            else
-            {
-                c = convert_to_display_c (c);
-                if (!is_printable (c))
-                    c = '.';
-            }
-#else /* HAVE_CHARSET */
-            if (!is_printable (c))
-                c = '.';
-#endif /* HAVE_CHARSET */
-            tty_print_anychar (c);
-        }
-        col++;
-#ifdef HAVE_CHARSET
-        if (view->utf8)
-        {
-            if (g_unichar_iswide (c))
-                col++;
-            else if (g_unichar_iszerowidth (c))
-                col--;
-        }
-#endif
-    }
-    view->dpy_end = from;
-    if (mcview_show_eof != NULL && mcview_show_eof[0] != '\0')
-    {
-        if (last_row && mcview_get_byte (view, from - 1, &c) && c != '\n')
-            row--;
-        while (++row < height)
-        {
-            widget_move (view, top + row, left);
-            tty_print_string (mcview_show_eof);
-        }
-    }
-}
-/* --------------------------------------------------------------------------------------------- */

new file tests/src/viewer/viewertest.txt

diff --git a/tests/src/viewer/viewertest.txt b/tests/src/viewer/viewertest.txt
new file mode 100644
index 0000000000000000000000000000000000000000..add62846d0097181ccfd117df40719441a85be28
GIT binary patch
literal 4680
zcmdTIYjYe&k?`DG5D?yPI-iijNtTl6VHH*2mn>Vdlh|^U@Y>qlo4eb1Z}+mZr#q##
zirte5U?~a2A+||K_#hH0RQZcB4iuEC_VfIR6pI}5FX*1xTiq!#6_@pe(Oyq?PxtIh
zPd{e$AS}*-$-#9z1|bU?j%S8`pfA-O4uRi>Iu1@B1lxC84Uf+Pz{BGaLTZl`*kplC
z71(qSn?4u~&NnlDPxHad`0fcO?rlPtnTT}#K{#?|4_gfFvn>Ya+Xq|hns6U%F*HQB
z`0k#rF*JCV_wF0JCfpB$3@&kIKP;jTp?@qLIt+H*3``qCg@a#(6*E|Myc)EfN?7MG
z+;+k`1gy#e=GhD}ufhTnFfD(Djg094%XQ{KJ20C~Rw+YR$G8hj2LrGMmWi{r9oGfV
z55aWX<|@a|sd@TL-qM?cf765H)Kot`IXSsQK7y@BJxomz56MROglg%{sCZjxx->Oa
z!f8sRU6N`grBTYoODb~whUIv2qqfbw5RKzO=r_qmY=s4@z?toP6>wRIwu51_40ajD
z#>PhU(YtSven=#TkBIo_F_Fw27xDZFk(^u*@!~0woPJcqk3BAuGiOD-bWS8sEQ|Qb
zr$qAXs)##nk*u^t9CDEa&xp9`i^OY)*gY?jryUX3Ya*#K5m#)HSf+^O=1%zm0}~hD
zAjIBlt0r*Vyo;kf_n>V7+yO-T_^b&qF^jJNc!)ZYU}{x}=wfc|p<)eSa)%HXNVh|}
z9i@w%(v^tP%~7RW5Tr|4?tL3DL$FxQ@#F#GiBUBu%N?6lI<3&O%IKG29o=^?=N@LA
zrt8>F2o)cXw*s~k*e>I04&F3v#z#~L=x7w#?$}5Azb<d6JxRg4rXroROM3vh{j+9+
z$;|*Qj9j*12IuvEzZ0v4{cl9B?%hAGD_LCI2gD;C1xver>Nrj=lwlD)s{u}vw;Eur
z!g1dxXTrqban}s-FnZxbb(+^5JeCRxqTi8q?=r7YLdlYAS$8vAyPS1zWNUxezVcj_
zJfFo0^547sYL;we$)C6X{#Mpq&yo$~-@4kpb@9r37yo+eI?8NpU;NAM^*6KRPd(v{
z?SK9QRkGwlmb{oHn^}U2*S0s`qG^-2w_kZJORi_#mu|nk^`GSBEV(LYzJxQ-{D~Cn
zuHU|KGfOVXp=-p9yAbK{|FTwjo=4e=>6tY%a0VO<gJD;3X!_g<9pB?)aI(cQ-&5|#
z@^J(m0s~D}hPDqjb6t*S?zh|uP}MMl0~LoW40CLap=2Qams~ghMbeE2fMQ7j*nv4W
zXY_UGv73J*@1|~ZOLhen<zv0U-e^u2!^L<Z@IHwhksX<y5T&c7y6M1KEV^*0fk_op
z>V7?uZod1@b+8E71S$j=fhvI-fjR+)z|#cI6L1MM2zUg10!;$X5C{lx0wIAGffWL6
z0v!UY1fE48JqE8892!msLs~k3PX~%93ow)R>IqkM67x<kbIQY|JcLN`sJujlM`6lZ
zfKg;xc7-AHzFY)@s??tWO0o!9GHni=ysGfWIw7VgOg`9gj^7GAoW%<ZXT6dxm)>Z(
zq0_|FHEfR*WyZ&P&qw-_t=x+&j#T;;Qjo;Tc@q7=M5NoQCfd{p6^ZDIHE{ue&vs;@
z=F}vPWbxoQ=TfI&DP;iUN)1I9!>D1?pPWj>$QKC#tlW~1@95{^&`#~aIG4A<GFYv!
z2zg~oI&M5@d9-TxIf`9j6vC+$CR2>NBFR|j0d$2RYvn}0!Eh9pElq<aQxr0ar!ZRo
z&@QvdH_)X0TFP)rtKUe4>7*8F<R(Q-8t-)ASF0)tlxdU=aP!53CdmI}B13-BV?^v~
zk#5XmZuQH?vLCpW2Mib}Gyv*WCL;oD+8I<N?B%Jw5+i_NScYM=@R#@a9?h87=9$J^
z&Be#p{IXU?rlBnxnr~=<p}B@uF|-GeDOaJP4KO%0{23Y+X$0LHxFQYLq|qWYT#cpS
z+6v0hYy%1WB_JouaaygW0$M{;7Ubp@a<@<v#}MF14@EBGxH<#hpb!aL1->U=?o)}t
z_eOga%VJ>pRJm%V<5508;jQ2&O$9tUPW%u+$G6YRTMSrq)9j;(YS9-Ix9R4+Bj~ZV
zyrbUXKI-7N2Yy#oodC-!{ug6JH^(fQ_Qvw$XN>8lW|g#(p{Xme7Y4rDmu+Eqe0+3b
z4p_%#&5+i0z^xvzgECCdmGM6iqX*|k@K%;rW&C2peFm)Q;$=);%<v-2^&cEMeB|h{
zx#RPrqYI0pryqUn@iS+a&ONdG<WoQV(T{)fQzQC*REV57xv+R@4-x%C_kH*yAN|<J
zKcNA9@>B5X&wTd&&wc(2U;NUSzw*_uef=BX{MNUlpG9e$CTTZaOE089PoGPFk^VCM
zReCX9PdC!%(-+bg(_g2*Nnc7Yr7x$yO<zebr&rRe>F?54)7R49r?01P$nff&%ekew
l(HZ#xN;w{PgT)9hpe@Ywr|_cO;;_)Fd*8t;$xbEE{|kv)4EF#4
