⚓ T68786 Rename revision_userindex to revision
Page Menu
Phabricator
Create Task
Maniphest
T68786
Rename revision_userindex to revision
Closed, Declined
Public
Actions
Edit Task
Edit Related Tasks...
Create Subtask
Edit Parent Tasks
Edit Subtasks
Merge Duplicates In
Close As Duplicate
Edit Related Objects...
Edit Commits
Edit Mocks
Mute Notifications
Protect as security issue
Assigned To
coren
Authored By
yuvipanda
Jun 18 2014, 2:12 PM
2014-06-18 14:12:00 (UTC+0)
Tags
Toolforge
(Backlog)
Cloud-Services
(Toolforge)
Referenced Files
None
Subscribers
Betacommand
Halfak
Merl
Petrb
Ricordisamoa
scfc
Springle
Description
I don't see a reason anyone would want to use the unindexed 'revision' vs 'revision_userindex'. We can rename 'revision' to revision_noindex, and revision_userindex to revision.
Same for the logging table.
The problem this causes is tools/users running queries on revision without reading the docs about revision_userindex, and both getting terrible performance themselves and slowing down the server as well.
Version
: unspecified
Severity
: normal
Details
Reference
bz66786
Related Objects
Search...
Task Graph
Mentions
Status
Subtype
Assigned
Task
Resolved
jcrespo
T50930
Database replication problems - production and labs (tracking)
Declined
coren
T68786
Rename revision_userindex to revision
Mentioned In
T199588: COUNT on revision is different than revision_userindex
Event Timeline
bzimport
raised the priority of this task from
to
Needs Triage
Nov 22 2014, 3:22 AM
2014-11-22 03:22:26 (UTC+0)
bzimport
added a project:
Toolforge
bzimport
set Reference to bz66786.
yuvipanda
created this task.
Jun 18 2014, 2:12 PM
2014-06-18 14:12:00 (UTC+0)
Halfak
added a comment.
Jun 18 2014, 2:18 PM
2014-06-18 14:18:37 (UTC+0)
Comment Actions
Agreed. I don't see any good reason why the non-indexed thing behind "revision" should occupy such a privileged name. Is it actually useful for querying?
If we can keep "revision_userindex" and make "revision" an alias to it, we can preserve backwards compatibility as well.
Merl
added a comment.
Jun 18 2014, 2:40 PM
2014-06-18 14:40:24 (UTC+0)
Comment Actions
According to doc revisions with suppresed username (by os, steward) )are currently only available from revision and not revision_noindex.
But i agree that the indexed version should be the default one because most people don't take care when writing sql queries and the special case to have really all revision available is less important because
yuvipanda
added a comment.
Jun 18 2014, 2:41 PM
2014-06-18 14:41:10 (UTC+0)
Comment Actions
(In reply to merl from comment #2)
According to doc revisions with suppresed username (by os, steward) )are
currently only available from revision and not revision_noindex.
I think you mean revision_userindex :)
coren
added a comment.
Jul 29 2014, 1:02 PM
2014-07-29 13:02:41 (UTC+0)
Comment Actions
The naming convention was selected on the "least surprise" principle; there are a number of rows not available in the _%index views and those rows would be missing without explanation.
That said, I'd have no fundamental objection to switching the names around if you think that will make things clearer.
I wish MariaDB could be told 'use that view if the query contains a where clause on those columns' :-)
yuvipanda
added a comment.
Jul 29 2014, 1:04 PM
2014-07-29 13:04:25 (UTC+0)
Comment Actions
Yeah, I think the rename will actually give people the 'least surprise' - I think running a query you expect to be fast but ends up being super slow and then having to go dig around why is more surprising.
coren
added a comment.
Jul 29 2014, 1:21 PM
2014-07-29 13:21:43 (UTC+0)
Comment Actions
That's not what I meant (performance); currently selection on 'revision' will give you the same result rows as making that same query in production whereas use of 'revision_userindex' may not.
Halfak
added a comment.
Jul 29 2014, 1:50 PM
2014-07-29 13:50:26 (UTC+0)
Comment Actions
Let me just say that I'm very surprised that either of the tables is missing rows that the other has. Why would "userindex" suggest that some rows have been filtered?
coren
added a comment.
Jul 29 2014, 1:53 PM
2014-07-29 13:53:35 (UTC+0)
Comment Actions
The rows that are filtered in the _userindex views are where the user id and user named have been NULLed because of suppression. You'd never *notice* the missing rows when using _userindex because you'd use that view while having a WHERE clause on those columns and the rows wouldn't have been returned /anyways/.
If, on the other hand, you have a where clause on the page columns (to recover a article history), then the rows with suppressed usernames would still show in their proper place as you'd get in production / on-wiki.
Halfak
added a comment.
Jul 29 2014, 2:01 PM
2014-07-29 14:01:53 (UTC+0)
Comment Actions
Marc, as a regular user of this data, that's very unintuitive.
Also:
SELECT COUNT(*) FROM revision_userindex WHERE rev_user IS NULL;
coren
added a comment.
Jul 29 2014, 2:02 PM
2014-07-29 14:02:47 (UTC+0)
Comment Actions
That last paragraph should have ended "... if you are using the revision table, but mysteriously missing if you are using revision_userindex".
coren
added a comment.
Jul 29 2014, 2:03 PM
2014-07-29 14:03:13 (UTC+0)
Comment Actions
rev_user may be NULL for reasons /other/ that suppression. :-)
coren
added a comment.
Jul 29 2014, 2:07 PM
2014-07-29 14:07:21 (UTC+0)
Comment Actions
(In reply to Aaron Halfaker from comment #9)
Marc, as a regular user of this data, that's very unintuitive.
Indeed it is, albeit clearly documented at:
This is what I meant by 'least surprise'; if you use the revision view, you get the expected (every row) result.
If you have a where clause on rev_user or rev_user_text (that isn't IS NULL for the obvious reason) then you *also* get every row you would have gotten from revision.
The unintuitive result occurs /only/ if you use revision_userindex while not also selecting on the value of rev_user. This is why the _userindex table was not made the default.
Betacommand
added a comment.
Jul 29 2014, 2:11 PM
2014-07-29 14:11:10 (UTC+0)
Comment Actions
Honestly this is a case where we just need to train our users to use the correct view. Nothing should be changed on the server side. As it should reflect production on production tables.
From what I have seen from tools labs our documentation/how to/support system is in need of a complete overhaul and a lot of improvements need to be made.
This is an easy case where the two tables play different roles and the users who use them should know the differences. Im not sure where the indexes are (rev_user, rev_user_text, multi-colum, or dual indexes) but depending on what your looking for those indexes may or may not help you.
coren
added a comment.
Aug 28 2014, 3:27 PM
2014-08-28 15:27:54 (UTC+0)
Comment Actions
After careful consideration, the necessity of having revision provide the same view as in production wins over the potential confusion of having some rows missing unless you query a different view.
Ricordisamoa
subscribed.
Jul 18 2015, 11:06 AM
2015-07-18 11:06:41 (UTC+0)
Restricted Application
added a project:
Cloud-Services
View Herald Transcript
Jul 18 2015, 11:06 AM
2015-07-18 11:06:42 (UTC+0)
Phabricator_maintenance
removed a subscriber:
yuvipanda
Jun 7 2017, 6:59 PM
2017-06-07 18:59:40 (UTC+0)
MusikAnimal
mentioned this in
T199588: COUNT on revision is different than revision_userindex
Jul 14 2018, 1:15 AM
2018-07-14 01:15:24 (UTC+0)
Log In to Comment
Content licensed under Creative Commons Attribution-ShareAlike (CC BY-SA) 4.0 unless otherwise noted; code licensed under GNU General Public License (GPL) 2.0 or later and other open source licenses. By using this site, you agree to the Terms of Use, Privacy Policy, and Code of Conduct.
Wikimedia Foundation
Code of Conduct
Disclaimer
CC-BY-SA
GPL
Credits