Ticket #3830 (new task)
mcedit: create a corpus of sample files in various syntaxes for testing purposes
Reported by: | mooffie | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | Future Releases |
Component: | mcedit | Version: | master |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Branch state: | no branch | Votes for changeset: | |
Description (last modified by mooffie)
(Henceforth, "syntax" == "syntax highlighting".)
Our editor has many syntax definitions (*.syntax files).
If we ever fix things on the C side of mcedit, or modify a syntax definition file, we'll have a problem: we don't have a collection of sample files, in the various syntaxes, to test our fixes against, so we (the maintainers) would have to create these sample files ourselves. And we'd have to create good files: ones that exercise every nook and cranny of the syntax definitions.
This is a lot of work, so I suggest we start small: have just one or two sample files for now, close this ticket, and add more sample files as time goes by.
To alleviate this burden we ought to make a rule:
Any new syntax definition must be contributed together with one or more sample files. The people writing the syntax files know their language best, so they're the ones who should provide the samples.
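A rule like this is easier to keep if something checks it. Here is a minimal sketch of such a check, under assumptions that are not part of this ticket: the function name `missing_samples`, the two directory arguments, and the convention that a sample "belongs" to `foo.syntax` when its base name is `foo` are all made up for illustration.

```python
from pathlib import Path


def missing_samples(syntax_dir, samples_dir):
    """Return the names of syntax definitions that have no sample file.

    Hypothetical convention: a sample covers foo.syntax if its base name
    (the file name without its last extension) is "foo".
    """
    syntax_names = {p.stem for p in Path(syntax_dir).glob("*.syntax")}
    sample_names = {p.stem for p in Path(samples_dir).iterdir() if p.is_file()}
    # Anything defined but not sampled is reported, in a stable order.
    return sorted(syntax_names - sample_names)
```

Calling `missing_samples(dir_with_syntax_files, dir_with_samples)` from a test script and failing when the list is non-empty would be one way to enforce the rule; where the samples should actually live is still open.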
Change History
comment:3 follow-up: ↓ 4 Changed 7 years ago by zaytsev
I have to say that I'm always on the side of more tests, but in this particular case I can't help asking if we have a larger problem.
What I mean by that is the last time I had a look at the syntax highlighter code, I almost had a heart attack. It's compact and ingenious, and it has actually been working for a very long time, but it's anything but easily understandable, well documented and properly tested. On top of that it has some genetic deficiencies, like the nested quoting bug. We also ended up having a whole library of highlighting rules, which, as you correctly mention, are not tested, but also not really maintained.
I've been thinking about it for quite a while, and my thoughts are that we aren't the first project attacking this problem, and there are tons of libraries for that purpose. To name only a few I have personally used in the past:
- http://pygments.org/
- https://wiki.gnome.org/Projects/GtkSourceView
- https://github.com/colorer/Colorer-library
Of course, there are good arguments against introducing a dependency on a syntax highlighting library, but maybe there is some middle ground, like implementing a minimalist engine and automatically generating syntax files from e.g. the Pygments collection...
Just thought I'd raise the point before you invest a substantial amount of time in testing the existing highlighter, even though a test corpus would be useful irrespective of whether the highlighter gets replaced or not.
comment:5 follow-up: ↓ 6 Changed 7 years ago by zaytsev
... not that I'm a huge fan of colorer myself.
It's not very well maintained (it seems that Igor gave up on it a long time ago), there are not that many syntax definitions available, they are mostly not up to date, the definition syntax is blood chilling, and the engine is written in C++ with a dependency on Apache Xerces.
I would really rather look in the direction of GtkSourceView and/or Scintilla.
comment:6 in reply to: ↑ 5 Changed 7 years ago by andrew_b
Replying to zaytsev:
... not that I'm a huge fan of colorer myself.
It's not very well maintained (it seems that Igor gave up on it a long time ago), there are not that many syntax definitions available, they are mostly not up to date, the definition syntax is blood chilling, and the engine is written in C++ with a dependency on Apache Xerces.
So can we close #2931 as wontfix?
comment:7 Changed 7 years ago by zaytsev
I don't know, I personally would rather re-purpose it to integrate an alternative syntax highlighter without specifically naming colorer.
comment:8 Changed 7 years ago by zaytsev
So the cool kids get cool libraries, and all we get is crap:
:-(
comment:9 follow-up: ↓ 10 Changed 7 years ago by teresaejunior
What about this one? http://www.andre-simon.de/doku/highlight/en/changelog.php
comment:10 in reply to: ↑ 9 Changed 7 years ago by andrew_b
Replying to teresaejunior:
What about this one? http://www.andre-simon.de/doku/highlight/en/changelog.php
This is C++.
comment:11 Changed 7 years ago by zaytsev
It being C++ isn't even the worst part of it :-/ For one, I couldn't find any embedding documentation / API for that one, and it doesn't seem to support incremental highlighting, etc. Apparently it's really geared towards whole-file colorization, and thus I don't think it's suitable for integration with the editor. At best, one could try to use it in the viewer to generate a colorized version using ANSI output...
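To make the incremental vs. whole-file distinction concrete, here is a toy sketch. It is neither mcedit's algorithm nor the API of any library mentioned in this ticket; the names `scan_line` and `IncrementalHighlighter` are invented, and the only "syntax" it knows is C-style `/* ... */` comments. The idea it illustrates: cache the scanner state at the end of every line, and after an edit rescan only until that state matches the cached one again.

```python
def scan_line(line, in_comment):
    """Scan one line; return (comment_spans, in_comment_at_end_of_line)."""
    spans = []
    pos = 0
    start = 0  # column where the current comment span begins on this line
    while True:
        if in_comment:
            end = line.find("*/", pos)
            if end == -1:
                if start < len(line):
                    spans.append((start, len(line)))
                return spans, True
            spans.append((start, end + 2))
            pos = end + 2
            in_comment = False
        else:
            start = line.find("/*", pos)
            if start == -1:
                return spans, False
            pos = start + 2  # skip the opener so "/*/" does not close itself
            in_comment = True


class IncrementalHighlighter:
    def __init__(self, lines):
        self.lines = list(lines)
        self.spans = [None] * len(self.lines)
        self.end_state = [None] * len(self.lines)  # cached state per line
        self.rehighlight_from(0)

    def rehighlight_from(self, index):
        """Rescan starting at `index`; return how many lines were rescanned."""
        state = self.end_state[index - 1] if index > 0 else False
        count = 0
        for i in range(index, len(self.lines)):
            spans, state = scan_line(self.lines[i], state)
            count += 1
            unchanged = state == self.end_state[i]
            self.spans[i], self.end_state[i] = spans, state
            if unchanged:
                break  # the following lines start from the same state as before
        return count

    def replace_line(self, index, text):
        self.lines[index] = text
        return self.rehighlight_from(index)
```

An edit that doesn't open or close a comment costs one line of rescanning, while opening a comment forces a rescan down to wherever it closes. A whole-file colorizer has no such cache and would redo everything on each keystroke.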
comment:12 Changed 7 years ago by ossi
so it seems that re-implementing the kate highlighting engine (now KSyntaxHighlighting) is popular - apart from the haskell-based skylighting mentioned above (and its predecessor highlighting-kate), qt creator also did it (with c++ again), as did Syntax::Highlight::Engine::Kate in perl. with so much code around to ~~rip off from~~ be inspired by, it would be a shame not to re-implement it again, this time in plain c. :D
Some random notes: