⚓ T294760 Many raw HTML messages in WikiEditor cannot be edited in translatewiki
Closed, Resolved
Public
Assigned To
jhsoby
Authored By
Amire80
Nov 1 2021, 1:17 PM
2021-11-01 13:17:04 (UTC+0)
Tags
translatewiki.net
(External)
WikiEditor (2010)
(Closed)
Referenced Files
None
Subscribers
Aklapper
Amire80
matmarex
Tacsipacsi
Verdy_p
Yupik
Description
The WikiEditor extension is among the most important ones for contributors to Wikimedia sites. Its localization, however, is hindered by the fact that it has several dozen raw HTML messages. To see a list, go to
and search for RawHtmlMessages.
These messages cannot be translated through the usual translatewiki interface for security reasons, and require administrator intervention.
It would be nice to get rid of this problem and allow regular translatewiki contributors to translate them.
Details
Related Changes in Gerrit:
Subject: Stop using autoMsg and use mw.messages directly instead
Repo: mediawiki/extensions/WikiEditor
Branch: master
Lines +/-: +387 -324
Event Timeline
Amire80
created this task.
Nov 1 2021, 1:17 PM
2021-11-01 13:17:04 (UTC+0)
Restricted Application added a subscriber: Aklapper
Nov 1 2021, 1:17 PM
2021-11-01 13:17:04 (UTC+0)
Nikerabbit moved this task from Backlog to External on the translatewiki.net board.
Nov 1 2021, 1:18 PM
2021-11-01 13:18:59 (UTC+0)
Tacsipacsi
subscribed.
Feb 8 2022, 7:11 PM
2022-02-08 19:11:12 (UTC+0)
We could start with marking a bunch of messages as not raw HTML (and, of course, properly escape them when displaying). The following 61 messages are marked as raw HTML, but have no HTML markup in English that would be forbidden in wikitext:
Message
Content
wikieditor-toolbar-help-heading-description
Description
wikieditor-toolbar-help-heading-syntax
What you type
wikieditor-toolbar-help-heading-result
What you get
wikieditor-toolbar-help-page-format
Formatting
wikieditor-toolbar-help-page-link
Links
wikieditor-toolbar-help-page-heading
Headings
wikieditor-toolbar-help-page-list
Lists
wikieditor-toolbar-help-page-file
Files
wikieditor-toolbar-help-page-reference
References
wikieditor-toolbar-help-page-discussion
Discussion
wikieditor-toolbar-help-content-italic-description
Italic
wikieditor-toolbar-help-content-italic-syntax
''Italic text''
wikieditor-toolbar-help-content-italic-result
Italic text
wikieditor-toolbar-help-content-bold-description
Bold
wikieditor-toolbar-help-content-bold-syntax
'''Bold text'''
wikieditor-toolbar-help-content-bold-result
Bold text
wikieditor-toolbar-help-content-bolditalic-description
Bold & italic
wikieditor-toolbar-help-content-bolditalic-syntax
'''''Bold & italic text'''''
wikieditor-toolbar-help-content-bolditalic-result
Bold & italic text
wikieditor-toolbar-help-content-ilink-description
Internal link
wikieditor-toolbar-help-content-ilink-syntax
[[Page title]]
[[Page title|Link label]]
wikieditor-toolbar-help-content-xlink-description
External link
wikieditor-toolbar-help-content-xlink-syntax
[http://www.example.org Link label]
[http://www.example.org]
http://www.example.org
wikieditor-toolbar-help-content-heading2-description
2nd level heading
wikieditor-toolbar-help-content-heading2-syntax
== Heading text ==
wikieditor-toolbar-help-content-heading2-result

Heading text


wikieditor-toolbar-help-content-heading3-description
3rd level heading
wikieditor-toolbar-help-content-heading3-syntax
=== Heading text ===
wikieditor-toolbar-help-content-heading3-result

Heading text


wikieditor-toolbar-help-content-heading4-description
4th level heading
wikieditor-toolbar-help-content-heading4-syntax
==== Heading text ====
wikieditor-toolbar-help-content-heading4-result

Heading text


wikieditor-toolbar-help-content-heading5-description
5th level heading
wikieditor-toolbar-help-content-heading5-syntax
===== Heading text =====
wikieditor-toolbar-help-content-heading5-result
Heading text

wikieditor-toolbar-help-content-ulist-description
Bulleted list
wikieditor-toolbar-help-content-ulist-syntax
* List item
* List item
wikieditor-toolbar-help-content-ulist-result
  • List item
  • List item

wikieditor-toolbar-help-content-olist-description
Numbered list
wikieditor-toolbar-help-content-olist-syntax
# List item
# List item
wikieditor-toolbar-help-content-olist-result
  1. List item
  2. List item

wikieditor-toolbar-help-content-file-description
Embedded file
wikieditor-toolbar-help-content-file-syntax
[[$1:Example.png|$2|$3]]
wikieditor-toolbar-help-content-file-caption
Caption text
wikieditor-toolbar-help-content-reference-description
Reference
wikieditor-toolbar-help-content-reference-syntax
Page text.<ref>[http://www.example.org Link text], additional text.</ref>
wikieditor-toolbar-help-content-named-reference-description
Named reference
wikieditor-toolbar-help-content-named-reference-syntax
Page text.<ref name="test">[http://www.example.org Link text]</ref>
wikieditor-toolbar-help-content-rereference-description
Additional use of same reference
wikieditor-toolbar-help-content-rereference-syntax
<ref name="test" />
wikieditor-toolbar-help-content-showreferences-description
Display references
wikieditor-toolbar-help-content-showreferences-syntax
<references />
wikieditor-toolbar-help-content-signaturetimestamp-description
Signature with timestamp
wikieditor-toolbar-help-content-signaturetimestamp-syntax
--~~~~
wikieditor-toolbar-help-content-signature-description
Signature
wikieditor-toolbar-help-content-signature-syntax
~~~
wikieditor-toolbar-help-content-indent-description
Indent
wikieditor-toolbar-help-content-indent-syntax
Normal text
:Indented text
::Indented text
wikieditor-toolbar-help-content-indent-result
Normal text
Indented text
Indented text

Only the following 8 messages do have markup that requires raw HTML:
Message
Content
wikieditor-toolbar-help-content-ilink-result
Page title
Link label
wikieditor-toolbar-help-content-xlink-result
Link label
[1]
http://www.example.org
wikieditor-toolbar-help-content-reference-result
Page text.[1]
wikieditor-toolbar-help-content-named-reference-result
Page text.[2]
wikieditor-toolbar-help-content-rereference-result
Page text.[2]
wikieditor-toolbar-help-content-showreferences-result
  1. ^ Link text, additional text.
  2. ^ Link text

wikieditor-toolbar-help-content-signaturetimestamp-result
--Username (talk) 15:54, 10 June 2009 (UTC)
wikieditor-toolbar-help-content-signature-result
Username (talk)
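A minimal sketch of the "properly escape them when displaying" step for messages that stop being raw HTML. WikiEditor itself is JavaScript; Python is used here only for illustration, and the helper name render_plain_message is hypothetical, not part of any MediaWiki API.

```python
import html


def render_plain_message(text: str) -> str:
    """Escape a plain-text message so it can be inserted into HTML safely.

    Hypothetical helper: it only illustrates that a message no longer
    treated as raw HTML must be escaped at display time.
    """
    return html.escape(text, quote=True)


# Example strings taken from the table above.
print(render_plain_message("Bold & italic"))
print(render_plain_message("<references />"))
```

With escaping done at display time, a translator cannot inject markup through these messages, which is the reason they would no longer need administrator review.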
Verdy_p
subscribed.
Feb 8 2022, 7:25 PM
2022-02-08 19:25:53 (UTC+0)
Even the 8 messages can be reformatted using "tvars", or a template-like (or parser-function-like) syntax to hide the raw HTML while still allowing the rest to be translated. This is possible because we can use {{#tag:tagname|attributes=values| ''content''}}, and most of it can be "hidden" in short tvars (like "$1") that won't confuse translators: they will see things like "translatable part 1 {{$1|fully translatable part 2}} translatable part 3" and will not be able to change tags or attributes like "{{#tag:a|href=#|title=something:Username|Username}}", which are partly masked in tvars as "{{$1:Username|Username}}".
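The masking idea in this comment could be sketched roughly as follows. This is only an illustration using plain string replacement; the mask and unmask helpers and the PROTECTED mapping are hypothetical names, and this is not how the Translate extension implements tvars.

```python
def mask(source: str, protected: dict[str, str]) -> str:
    """Replace each protected fragment with its placeholder (e.g. "$1")."""
    for placeholder, fragment in protected.items():
        source = source.replace(fragment, placeholder)
    return source


def unmask(translation: str, protected: dict[str, str]) -> str:
    """Put the protected fragments back into a translated string."""
    for placeholder, fragment in protected.items():
        translation = translation.replace(placeholder, fragment)
    return translation


# Hypothetical mapping: the part translators must not change.
# A single placeholder is used, so "$1" cannot collide with "$10" here.
PROTECTED = {"$1": "#tag:a|href=#|title=something"}

source = "{{#tag:a|href=#|title=something:Username|Username}}"
masked = mask(source, PROTECTED)
print(masked)
print(unmask(masked, PROTECTED))
```

The translator only ever edits the masked form, so the tag name and attributes stay under the developers' control.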
Tacsipacsi
added a comment.
Feb 8 2022, 8:12 PM
2022-02-08 20:12:34 (UTC+0)
Actually we could unify the ...-syntax and ...-result messages: the HTML
Page text.&lt;ref&gt;[http://www.example.org Link text], additional text.&lt;/ref&gt;
is the result of parsing the wikitext
<nowiki>Page text.<ref>[http://www.example.org Link text], additional text.</ref></nowiki>
while
Page text.<sup><a href='#'>[1]</a></sup>
is the result of replacing the href attribute after parsing the wikitext
Page text.<ref>[http://www.example.org Link text], additional text.</ref>
We could just store and localize
Page text.<ref>[http://www.example.org Link text], additional text.</ref>
and once parse it nowiki’d, the other time parse it not nowiki’d and replace the href attribute in the result. In addition to completely avoiding raw HTML, this would also mean half as much work for translators and a guarantee that the example wikitext and the example result will really be the same.
For the signature, we also need to localize the string Username (it doesn’t appear in the source text, but appears once as link text and twice as link title in both results). The timestamp should be auto-generated, which also means that it will be in content language, not UI language, matching the actual signatures’ behavior.
Verdy_p
added a comment.
Edited
Feb 8 2022, 8:51 PM
2022-02-08 20:51:39 (UTC+0)
Note that user names HAVE to be Bidi-isolated: many users have names in various scripts. Remember that SUL is now in effect across Wikimedia, so users with Arabic names appear on English Wikipedia alongside names in Latin script, and these can all get mixed up if the names also embed punctuation or other characters with "weak" directionality (notably in leading positions, while in trailing positions they affect the content after the user name).
The same would apply to translatable titles in the internal text of a link, or other attributes like captions of images. Finally, some languages require *additional* markup not present in the English source: for example, "sup" elements for abbreviations of ordinal numbers, or even more for complex cases like "hiero", which is fully unusable as plain text without this markup, or others that require specific layouts.
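As a rough illustration of Bidi isolation for user names, the sketch below shows two standard mechanisms: the Unicode isolate characters U+2068 (FIRST STRONG ISOLATE) and U+2069 (POP DIRECTIONAL ISOLATE) for plain text, and the HTML bdi element for markup. The helper names are hypothetical and this is not WikiEditor code.

```python
import html

FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate_plain(name: str) -> str:
    """Wrap a user name in Unicode isolate characters for plain-text output."""
    return f"{FSI}{name}{PDI}"


def isolate_html(name: str) -> str:
    """Wrap an escaped user name in a <bdi> element for HTML output."""
    return f"<bdi>{html.escape(name)}</bdi>"


# A name with a leading weak-direction character, as in the comment above.
print(isolate_html("(مثال)"))
print(repr(isolate_plain("Example!")))
```

Either form keeps the directionality of the name from leaking into the surrounding text, whatever script the name is written in.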
Verdy_p
added a comment.
Edited
Feb 8 2022, 9:01 PM
2022-02-08 21:01:46 (UTC+0)
Consider Traditional Mongolian: normally it uses a vertical layout. But when it is inserted inside a document using a horizontal layout, its conversion to horizontal behaves differently in an RTL context and in an LTR context, because the Mongolian glyphs and direction will be rotated upside down (180 degrees).
When the script is sinographic, or Hangul, Kana or Bopomofo, this is different: glyphs are normally not rotated; only the baseline changes (along with different metric lines).
Now insert Latin/Greek/Cyrillic or Arabic in a Mongolian vertical text: this time Latin and Arabic will be rotated differently, or the Latin/Greek/Cyrillic script may not rotate its letters but will align them vertically like in crosswords...
And in all these cases, some punctuation marks and symbols are substituted, but not all.
These complexities of vertical presentation are still not solved in existing HTML/CSS specs, or in the Unicode specs for BiDi, which just assume a single horizontal direction, do not even consider the case of boustrophedon or scripts with variable directions like Old Greek and Late Phoenician derivatives, and then just assume that all vertical scripts behave like LTR, i.e. like Sinograms/Kana/Hangul.
So the assumption that a translation should only contain "plain text" and that translators must not insert any additional markup (even if it is required) brings complications that designers who were not trained with multilingual knowledge and awareness of layout constraints often forget. This goes up to the design of HTML itself (including HTML5), with its old concept of "inline content" and "block content", and of CSS (left vs. right, when the actual distinction would be top vs. bottom, changing depending on the context to start vs. end, or the supposed existence of a single vertical or horizontal alignment of baselines).
Translators do the best they can, but sometimes their requests are ignored because some monolingual technicians asked them not to use any markup (or not to alter the existing markup, which was tested only for English or Chinese). Users of Arabic/Hebrew/Divehi/N'ko know what this means for them, as do students of paleographic languages: they simply don't use HTML and still publish using 2D-aware document formats, later rendered as PDFs; for communication they use their own proprietary solutions, or what we illegitimately call "hacks", or they have to abandon their scripts and talk using other scripts or languages.
Tacsipacsi
added a comment.
Feb 8 2022, 10:20 PM
2022-02-08 22:20:56 (UTC+0)
I don’t think I assumed anywhere that the messages contain plain text only–or do you mean nowiki’ing the result? But the nowiki’d text should represent wikitext input, and you can’t use any markup to control how wikitext is displayed in the