Sophox - OpenStreetMap Wiki
Sophox
From OpenStreetMap Wiki
Jump to navigation
Jump to search
Help
Sophox
In other languages
Afrikaans
asturianu
azərbaycanca
Bahasa Indonesia
Bahasa Melayu
bosanski
brezhoneg
català
čeština
Crnogorski
dansk
eesti
Esperanto
euskara
Frysk
Gàidhlig
galego
hrvatski
Ido
interlingua
íslenska
italiano
Kreyòl ayisyen
kréyòl gwadloupéyen
kurdî
latviešu
Lëtzebuergesch
lietuvių
magyar
Nederlands
norsk
norsk nynorsk
occitan
polski
português
română
sardu
shqip
slovenčina
slovenščina
srpski (latinica)
suomi
svenska
Tagalog
Tiếng Việt
Türkçe
Zazaki
Ελληνικά
беларуская
български
македонски
монгол
русский
српски / srpski
українська
հայերեն
עברית
العربية
سرائیکی
فارسی
پنجابی
پښتو
नेपाली
मराठी
हिन्दी
বাংলা
ਪੰਜਾਬੀ
தமிழ்
മലയാളം
සිංහල
ไทย
မြန်မာဘာသာ
한국어
ქართული
ⵜⴰⵎⴰⵣⵉⵖⵜ
中文(简体)
中文(繁體)
粵語
Other languages...
Sophox
Author:
Sophox/sophox/graphs/contributors
GitHub
License:
Apache License 2.0
Status:
Active
Website:
sophox.org
Source code:
Sophox/sophox
GitHub
Programming languages:
Python, Rust, and JavaScript
A collection of services exposing OSM data, metadata, and other microservices
Features
Feature
Value
Map Display
Routing
Navigating
Tracking
Monitoring
Editing
Rendering
Error missing media source
Short video on getting started
Sophox is a collection of data services to work with OSM data, OSM tag metadata (from this wiki), and use external (federated) data sources like
Wikidata
. The service uses a well known SQL-like language called
SPARQL
. The service can also be used for
direct OSM editing
, either in a manual "search & replace", or as a challenge similar to
MapRoulette
. All Wikidata SPARQL documentation is at
Wikidata Query Help.
Similarly to
Overpass-Turbo
, this service can be used directly from JOSM (enable Wikipedia plugin).
This short getting-started video
(thumbnailed on the right) should give a quick intro. You can also use the "examples" button at
to try the service using
these example queries
This service is made available thanks to a generous hardware donation by
Elastic
(of Elasticsearch fame).
Contents
Important Changes
Current Challenges
Hello (OSM SPARQL) World
Using from JOSM
4.1
Examples:
How OSM data is stored
Other Data Sources
6.1
OSM Metadata
6.2
Key Usage Statistics
6.3
Wikipedia Page View Counters
External Data Sources
7.1
Overpass Integration
7.2
Parameters
Generating Polygon Files
Quick-fix Editor
10
Other Queries
11
Current limitations
12
See also
13
External links
Important Changes
Sophox no longer contains a copy of the Wikidata data due to space and performance limitations. Use federated query to call out into Wikidata's sparql endpoint.
Current Challenges
See
Wikipedia Improvement Tasks
Links to disambiguation pages
- update to the correct Wikipedia & Wikidata items
Links to plants
- should use
species:wikidata
=*
instead.
Brands as "wikidata" or "operator:wikidata"
- these objects should use
brand:wikidata
=*
Hello (OSM SPARQL) World
Here is a simple example of using
SPARQL
, the
Wikibase RDF format
and
OSM RDF format
. This query lists a few OSM place objects, and can be executed by clicking "run it" underneath it. (Compare it with a similar
Overpass Query
.)
There are many more
query examples
on how to cross-reference OSM data with Wikidata, use Wikipedia page popularity, perform quality control queries, and other.
# A sample query to list all OSM objects with a place tag.
# The * will show all variables that are used in the WHERE clause
select
where
# The first expression matches the three columns of the Triple Store with:
# "find any subject that has an OSM 'place' tag, and any value of that tag"
# Question mark represents a variable. ?osm becomes a URI for an OSM object, e.g.
# (Everest)
# The same URI may also be written with a prefix - osmnode:2681940767
# All OSM tags use osmt: prefix. Using osmt:place constant only
# matches OSM object with a place tag. The ?place variable will
# become the value of the place tag. You can use a constant instead of a variable,
# e.g. substitute ?place with a constant string 'city'.
?osm
osmt
place
?place
# osmm: prefix means "meta values", e.g. OSM object type, location (loc), version, ...
# Uncomment to limit to a specific type of an OSM object ('n', 'r', 'w').
# ?osm osmm:type 'r'.
# osmm:loc is a centroid point for most OSM objects. If included, the results
# can be shown on a map by switching to the map view (right under the query)
# ?osm osmm:loc ?coordinates.
# The FILTER expressions allow complex filtering beyond a simple "must match".
# filter(?place = 'town' || regex(?place, '[a-z][0-9]', 'i'))
# This will limit results to places which do not have a `name:en` tag.
# filter not exists { ?osm osmt:name:en ?nameen. }
# For OSM objects with the wikidata tag, find the English name of that object in Wikidata
# if it exists.
?osm
osmt
wikidata
?wd
# With that get from the Wikidata SPARQL endpoint service what type of the object it
# is (P31 is the property instance of in Wikidata), and the label for that type,
# also if exists in English. For labels and descriptions, you may instead use a
# service that picks first available language based on a list.
# service {
# ?wd wdt:P31 ?type.
# optional { ?wd rdfs:label ?label. filter(lang(?label) = "en") }
# optional { ?type rdfs:label ?typelabel. filter(lang(?typelabel) = "en") }
# }
# Similar to SQL, queries can use aggregations, subqueries, and limit the results
limit
10
Run it
edit query
Using from JOSM
This service can be used directly from
JOSM editor
Install
Wikipedia plugin
and enable
expert mode
(in the
view
menu)
Use File / Download data /
Download from Sophox API
tab.
JOSM will download all OSM object IDs it finds in the query result. All other values will be ignored, and should not be requested to reduce server load. There are several magic keywords to help with the query:
{{boxParams}}
: use it with
wikibase:box service
. Your map selection will be converted to two lines for cornerWest and cornerEast parameters.
{{center}}
: the center of the map selection will be inserted as a point constant.
Examples:
These queries only run from JOSM because they use {{*}} parameters.
Running a query with a box service on large areas is often slower than getting all objects worldwide with a specific condition
OSM objects are not humans, but humans could be their
subject
, so
subject:wikidata
should be used. This query finds objects tagged with human in wikidata tag within the selected area. This query is very slow if used on a large area.
# Find objects tagged as humans in a selected area (very slow on large areas)
SELECT
?osmid
WHERE
SERVICE
wikibase
box
?osmid
osmm
loc
?location
{{
boxParams
}}
?osmid
osmt
wikidata
?wd
?wd
wdt
P31
wdt
P279
wd
Q5
This query use s a simple calculation and filter to find objects that form a ring around the selection's center point.
# Find educational places at least 2, and at most 3 km from the center of the selection
SELECT
?osmid
WHERE
VALUES
?amenity
"kindergarten"
"school"
"university"
"college"
?osmid
osmt
amenity
?amenity
osmm
loc
?loc
BIND
geof
distance
({{
center
}},
?loc
as
?distance
FILTER
?distance
&&
?distance
Same as above, but using "wikibase:around" service.
# Find educational places at least 2, and at most 3 km from the center of the selection
SELECT
?osmid
WHERE
VALUES
?amenity
"kindergarten"
"school"
"university"
"college"
?osmid
osmt
amenity
?amenity
SERVICE
wikibase
around
?osmid
osmm
loc
?loc
bd
serviceParam
wikibase
center
{{
center
}}
bd
serviceParam
wikibase
radius
"3"
bd
serviceParam
wikibase
distance
?distance
FILTER
?distance
How OSM data is stored
All data is stored in a
Triplestore
as
subject predicate object.
statements. For example, a statement
relation #123 has a tag "name" set to "value"
can be represented as
subject
(relation #123),
predicate
(a tag "name"), and
object
("value"). Both the subject and the predicate parts of the statement must always be a complete URI, e.g.

(way #42), or

(tag wikidata). To make the URI more readable, we use prefixes to shorten it, e.g.
osmway:42
and
osmt:wikidata
. The object part of the statement can be either a value (string/number/boolean/geo coordinate/...), or, just like the first two parts, a URI. This way one statement's object could be another statement's subject, creating a linked graph. In SPARQL, each statement must end with a period, but if multiple statements have the same subject, we can separate them with a semicolon. The prefixes are defined in the engine, but could be specified manually for clarity.
Note:
All RDF data is stored in a single giant "subject/predicate/object" table. That data includes OSM data, OSM tags metadata from this wiki, Wikipedia pageviews statistics, and other. So the subject could be
osmnode
1234
osmd
Q42
, sitelinks, or many other types of URIs. The easiest way to get just the OSM objects is to use
osmm
type
predicate - all OSM subjects will have one. But for performance reasons, don't use more filters than you need - if you are querying for any OSM objects with a
osmt
wikipedia
, you don't need to specify the
osmm
type
, because
osmt
wikipedia
already implies an OSM object.
osmnode
1234
osmm
type
'n'
osmnode
1234
osmm
loc
'Point(32.1 44.5)'
^^
geo
wktLiteral
# longitude/latitude
osmnode
1234
osmt
name
'node name tag'
osmnode
1234
osmt
name:en
'node name:en tag'
osmnode
1234
osmt
wikipedia

osmnode
1234
osmt
wikidata
wd
Q34
osmway
2345
osmm
type
'w'
osmway
2345
osmm
loc
'Point(32.1 44.5)'
^^
geo
wktLiteral
# way's centroid
osmway
2345
osmm
isClosed
true
# is this way an area or a line?
osmway
2345
osmt
name
'way name tag'
osmway
2345
osmt
name:en
'way name:en tag'
osmway
2345
osmt
wikipedia

osmway
2345
osmt
wikidata
wd
Q34
osmrel
3456
osmm
type
'r'
osmrel
3456
osmm
has
osmway
2345
# relation contains way #2345
osmrel
3456
osmway
2345
"inner"
# way #2345 has an "inner" role
osmrel
3456
osmm
loc
'Point(32.1 44.5)'
^^
geo
wktLiteral
# centroid of all members
osmrel
3456
osmt
name
'way name tag'
osmrel
3456
osmt
name:en
'way name:en tag'
osmrel
3456
osmt
wikipedia

osmrel
3456
osmt
wikidata
wd
Q34
# Other common values:
...
osmm
version
42
# Last object's version
...
osmm
changeset
1234567
# Last edit of this object in this changeset
...
osmm
timestamp
'2017-08-23T01:02:03+00:00'
^^
xsd
dateTime
# Last edit time
...
osmm
user
'UserName'
# Name of the user - could change
Other tags:
osmm
badkey
"@Sammelmütze"
- any unsupported tag names will be stored here as "osmm:badkey" objects, without their own values. For a tag to be supported, it must match
/^[0-9a-zA-Z_]([-:0-9a-zA-Z_]*[0-9a-zA-Z_])?$/
regex -- contain only Latin letters, digits, and underscore. Dashes "-" and colons ":" may also be used except in the first and last positions.
osmm
loc:error
"Error: ..."
- whenever node or way's location could not be calculated, this property contains relevant error info.
Other Data Sources
OSM Metadata
This wiki's
data items
, such as key and tag description, is also imported into Sophox using the same method as
Wikidata Query Service
, except that it uses slightly different namespaces, e.g.
wd → osmd
and
wdt → osmdt
. Check out an excellent
SPARQL wikibook
A special
osmd
...
osmm
key
osmt
:...
is automatically added to tie the data item with the OSM data.. For example
bridge
(Q103)
has
osmd
Q103
osmm
key
osmt
bridge
Key Usage Statistics
All metadata keys include their usage statistics from
Taginfo
. For example,
bridge
(Q103)
has these extra values, as defined in
Taginfo API
osmd
Q103
osmm
count_all
3621295
# Number of objects in the OSM database with this key.
osmd
Q103
osmm
count_all_fraction
0.0007
# Number of objects in relation to all objects.
osmd
Q103
osmm
count_nodes
4790
# Number of nodes in the OSM database with this key.
osmd
Q103
osmm
count_nodes_fraction
# Number of nodes in relation to all tagged nodes.
osmd
Q103
osmm
count_relations
742
# Number of ways in the OSM database with this key.
osmd
Q103
osmm
count_relations_fraction
0.0001
# Number of ways in relation to all ways.
osmd
Q103
osmm
count_ways
3615763
# Number of relations in the OSM database with this key.
osmd
Q103
osmm
count_ways_fraction
0.0066
# Number of relations in relation to all relations.
osmd
Q103
osmm
users_all
62161
# Number of users owning objects with this key.
osmd
Q103
osmm
values_all
297
# Number of different values for this key.
Wikipedia Page View Counters
Most popular pages have pageview counters stored in Sophox, making it possible to order results by popularity in a specific wiki. Note that the count has no meaning in itself, only as relative to other pageviews. This way one can see the importance/prominence of a city vs another city for labeling purposes.

pageviews
12345
External Data Sources
Overpass Integration
The Overpass service can produce CSV-formatted results by using [csv:...] directive. You can create a simple data URL in Overpass Turbo by clicking "export" and copying the "raw data directly from Overpass API" link. This produces a hard-to-read URL-encoded query string. Alternatively, you can use the original query as a
otQuery
parameter. Specifying otQuery sets other defaults to parse tab-separated results.
out
csv
::
id
::
type
'name'
)][
timeout
25
];
node
"tourism"
"museum"
](
{bbox
);
way
"tourism"
"museum"
](
{bbox
);
relation
"tourism"
"museum"
](
{bbox
);
);
out
Use
Export / Query / convert compact
link to get the query as a single string to convert the
extended shortcuts
like {{bbox}} into a regular Overpass QL query. The
above query
produced a
list of museums
, and we use its output in this SPARQL query to find the "instance of" (P31) of the corresponding Wikidata entries.
SELECT
WHERE
# We need to set query optimizer to either "None" or "Runtime". The default "Static" optimizer
# will try to get data from the service after pulling all locations with wikidata, which results
# in Out Of Memory error. Instead, it should get the service first (same order as the query = None),
# or do on-the-fly performance optimization with "Runtime". See Blazegraph help:
# https://wiki.blazegraph.com/wiki/index.php/QueryOptimization#Join_Order_Optimization
hint
Query
hint
optimizer
"None"
# Get a list of museums in a predefined area from Overpass
SERVICE
wikibase
tabular
bd
serviceParam
wikibase
otQuery
'[out:csv(::id,::type,"name")][timeout:25];(node["tourism"="museum"](40.69,-74.09,40.76,-74.02);way["tourism"="museum"](40.69,-74.09,40.76,-74.02);relation["tourism"="museum"](40.69,-74.09,40.76,-74.02););out;'
?type
tabular
%40type
'string'
?id
tabular
%40id
'string'
# Use string instead of integer to avoid another conversion in ?osmID creation below
?name
tabular
name
'string'
# Dynamically create OSM object ID from Overpass type & id fields
BIND
URI
CONCAT
"https://www.openstreetmap.org/"
?type
"/"
?id
))
as
?osmId
# Match Overpass results with the RDF data
# (We could have retrieved wikidata & location from OT as well)
?osmId
osmm
loc
?loc
# If exists, get Wikidata ID, and the type (instance of) of that ID with a label
OPTIONAL
?osmId
osmt
wikidata
?wikidata
?wikidata
wdt
P31
?instanceOf
SERVICE
wikibase
label
bd
serviceParam
wikibase
language
"[AUTO_LANGUAGE],en"
?instanceOf
rdfs
label
?instanceOfLabel
Run it
edit query
Parameters
Input parameters:
Each parameter must be in the form
bd
serviceParam
wikibase
paramName

name
type
Description
url
string
URL to the data source. Service must have either
url
or
otQuery
parameter.
otQuery
string
Overpass QL query that returns a
CSV-formatted
results. The entire query must be written as a single line. In the output, use url-encoded column names for special characters. (e.g.
tabular:%40id
for
@id
).
format
string
Which
CSVFormat
profile to use for parsing. Uses
DEFAULT
for regular urls, and
TDF
for the Overpass queries
firstRowIsHeader
boolean
If true, uses first row for column names. By default, OT queries will set this to true, and url requests will use the default from the format.
ignoreSurroundingSpaces
boolean
The trimming behavior, true to remove the surrounding spaces, false to leave the spaces as is. See
doc
commentMarker
char
Sets comment start marker to the specified character. Comment start character is only recognized at the start of a line. See
doc
delimiter
char
Sets delimiter character. See
doc
escape
char
Sets escape character. See
doc
quote
char
Sets quote character. See
doc
useColumnNames
boolean
If true, use
tabular:columnName
to access columns. Otherwise, use
tabular:N
, where N is a 0-based column index. By default, this value is the same as firstRowIsHeader.
Output parameters:
All output parameters must be in the form of
?variable
tabular
columnName
"dataType"
or
?variable
tabular
columnIndex
"dataType"
. The
useColumnNames
parameter controls if it should be the
columnName
or
columnIndex
. Allowed data types: string, double, integer, uri.
Generating Polygon Files
If your query contains an
?id
field, and each of its values is unique, you can use it to generate a file with polygons, where each polygon's ID will be set to the
?id
field, and any additional fields will be used as polygon's properties. For example, your query could generate the list of US states, and for each state get the name of the governor as a
?governor
field. The result will be a file with 50 features, and each feature will have a
"governor"
property. The result will be shown using
Mapshaper
. You can inspect and simplify the result, add data files, and do many other geometry manipulations.
To use this mode, add
#defaultView:MapRegions
line at the top of your query.
This query shows all Canadian provinces, with their names in English and French, their ISO 3166-2 code, and their flags (URL on Wikimedia Commons).
#defaultView:MapRegions
SELECT
?id
SAMPLE
?label_en
as
?label_en
SAMPLE
?label_fr
as
?label_fr
SAMPLE
?iso_3166_2
as
?iso_3166_2
SAMPLE
?flag
as
?flag
WHERE
# List of regions, whose sub-regions we want
VALUES
?entity
wd
Q16
SERVICE

# P150 = "contains administrative territorial entity"
# but must not have a P582 (end date) qualifier
?entity
P150
?statement
?statement
ps
P150
?id
FILTER
NOT EXISTS
?statement
pq
P582
?x
# Get labels, ISO code, and flag image
# for each sub-region, if available
OPTIONAL
?id
rdfs
label
?label_en
FILTER
LANG
?label_en
"en"
OPTIONAL
?id
rdfs
label
?label_fr
FILTER
LANG
?label_fr
"fr"
OPTIONAL
?id
wdt
P300
?iso_3166_2
OPTIONAL
?id
wdt
P41
?flag
# remove possible ID duplicates
GROUP BY
?id
Run it
edit query
Quick-fix Editor
Quick-fix editor has been fully re-implemented based on community feedback. See
Sophox Editor
Other Queries
The query service uses a community-maintained
query example list
- directly accessible and searchable from the query interface under the "examples" button.
See overview of
comparison (SPARQL vs Overpass QL)
about some simple query examples.
Current limitations
Only includes OSM nodes that have at least one tag. All ways and relations, even empty ones, are included.
The database only stores values for simple tag keys - those that only English letters, digits and symbols - : _. Tag keys that don't satisfy this criteria, e.g. those that contain spaces or accented characters, are stored as
osmm
badkey
"bad tag key"
, without their values.
OSM geometries are not imported, but most objects have
osmm
loc
(center point), and ways have
osmm
isClosed
(true/false).
See also
LinkedGeoData
QLever
External links
Wikidata
Wikidata
has a Q-item on
Sophox
Sophox
Sources on GitHub
Look Both Ways: Integrating OpenStreetMap and Wikidata
at
Connect 2020
Slides and notes
Wikimedia-focused versions at
WikiConference North America 2021
and
WikidataCon 2021
Retrieved from "
Categories
Apache License
Python
Rust
JavaScript
Sophox
Hidden categories:
Pages unavailable in German
Pages unavailable in Spanish
Pages unavailable in French
Pages unavailable in Italian
Pages unavailable in Dutch
Pages unavailable in Russian
Software with no platform specified
Navigation menu