Linked Data for Language Technology Community Group
This group aims to consult with current and potential users of linguistic data to assemble use cases and requirements for Language Technology Applications that use Linked Data. The results will be used to guide future interoperability, research and development activities spanning the language technology and linked data domains. Potential users are companies and public bodies involved in natural language processing, language resources, content management, the language services and localisation industry, and other applications of content analytics techniques used in search, recommender systems, sentiment analysis and terminology management. The group engages with users through surveys, international events and training activities organized in conjunction with partners from academia or industry, or with designated research projects and networking efforts (especially EU or other multinational projects). We identify use case and requirements priorities, technology gaps and interoperability roadblocks. We work towards community group reports that describe our findings and/or solutions to the challenges identified in our work.
Note: Community Groups are proposed and run by the community. Although W3C hosts these
conversations, the groups do not necessarily represent the views of the W3C Membership or staff.
Linguistic Linked Data in selected domains: 5th and 6th LIDER Roadmapping Workshops to be held in July 2015
David Lewis
Posted on: June 26, 2015
Over the last year and a half, the LIDER project has organized several roadmapping events to gather a broad community around the topic of linguistic linked data. In July this year, LIDER will engage with two selected communities. On July 6, the 5th LIDER roadmapping workshop will be held at Sapienza University of Rome. The topic will be cross-media linked data, and the event will feature several high-level speakers from the multimedia area. On July 13, LIDER will organize the 6th roadmapping workshop in Munich. The event will be hosted by Siemens and will focus on content analytics and linked data in healthcare and medicine. Participation is limited for both workshops. If you are interested in the Rome event please contact Tiziano Flati; for Munich please contact Philipp Cimiano.
Roadmapping Linked Data in the Localisation Industry, Dublin, 4th June ’14
David Lewis
Posted on: May 27, 2014
Following successful road-mapping workshops at the European Data Forum and the Multilingual Web conference, the LD4LT group will next be engaging with the localisation industry on 4th June ’14 in Dublin, Ireland. This will be part of the FEISGILTT workshop on harmonising localisation industry standards, which is co-located with Localisation World, the industry’s premier trade show. The focus will be on use cases where linked data can contribute to interoperability for localisation tools and processes, including terminology management and parallel text curation.
LD4LT at LREC’14
David Lewis
Posted on: May 27, 2014
The use of Linked Data for Language Resources is a hot topic at the Language Resources and Evaluation Conference (LREC’14), 26-31 May, Reykjavik, Iceland. The week kicked off with an LD4LT Tutorial. All LREC delegates can contribute by completing a questionnaire included in the conference bag (and enter a prize draw), and can talk to existing LD4LT members at the LIDER/FALCON booth in the EU Village.
Two surveys released on the future of language technologies on the web
David Lewis
Posted on: May 1, 2014
The results of an initial survey conducted by the LIDER project into requirements and use cases related to linguistic linked data are now available. It is interesting to read this in tandem with results of a survey just released by LT-Innovate on interest in a European Language Cloud.
Linked Data for Language Technology Roadmapping Workshop, 21st March 2014 Athens, Greece
David Lewis
Posted on: February 18, 2014
The Linked Data for Language Technology community is organising a roadmapping workshop on 21st March in Athens, to build a better understanding of the potential synergies and co-evolution paths for language technologies, such as machine translation, information extraction and sentiment analysis, and linked data. Language technologies are key to extracting information from unstructured content in different languages to form linked data, while linked data can aid the discovery and sharing of the language resources that underpin language technologies.
Who should attend?
Any organisation interested in automated extraction of data from unstructured digital content, especially content in more than one language and including multimedia as well as textual content. Organisations engaged in the market for language technologies applied beyond English-language content and data. All these can benefit from more open access to linked language resources.
How can you participate?
You can register for the event here. If you wish to present a position statement you can indicate this on your registration form. The event will then proceed in a structured open format to identify and capture participants' use case priorities and the interoperability, best-practice and technology gaps they face. An online survey is currently open for gathering industry views on use case prioritisation. You can also contribute directly by joining the Linked Data for Language Technology community at the W3C.
Programme and Topics:
The workshop will open with keynotes from Hans Uszkoreit, Scientific Director at DFKI; Nicoletta Calzolari, Director of Research at CNR; Phil Archer, who is leading the W3C Data Activity; and Asun Gomez-Perez of UPM, who is leading the LIDER coordination action on linguistic linked data. This will be followed by short briefings from four existing international communities working in this area, by position statements from companies about existing use cases, and by an open workshop session to establish use case priorities.
The language resource community has already made a concerted attempt to catalogue different data sets through the META-SHARE initiative. It has tackled the need for common meta-data for linguistic corpora of various types and has paid particular attention to encoding the different usage rights that exist across governmental, academic and commercial data sources. This initiative is therefore well primed to exploit linked data technologies being standardised by the W3C Data Activity to further open the cataloguing and discovery of language resources.
This is particularly timely as the European Commission has launched its new H2020 funding programme with strong support available for innovation and research in the open data and language technology space. In April 2014 it will also launch its Connecting Europe Facility programme, with €1 billion for funding new pan-European digital services, including open data exchange and automated translation services. In both these initiatives, strong, open solutions for the interoperability of language resources as open web data will be key.
The workshop will take a use-case-driven approach to key questions around the synergies possible between the W3C’s open web data standards and existing approaches to sharing language resources and applying them for training language technologies:
How can language resource sharing infrastructure, such as META-SHARE, migrate to a linked data approach so as to benefit from more robust, decentralised and scalable publication and search features?
How well can existing linked data vocabularies such as Creative Commons Rights Expression Language and Linked Data Rights support the usage rights models established for language resources?
How far can language resource meta-data be supported by the Data Catalogue Vocabulary or the Vocabulary of Interlinked Datasets?
How can emerging onto-lexical resources such as BabelNet be usefully interlinked with individual terms in existing language resources?
How can the process of locating and managing language resources to train language technologies be eased and optimised by vocabularies such as the Provenance Ontology or the Provenance and Plans Ontology for repeatable data workflows?
However, these are just a sample of the many issues and viewpoints that will have a bearing on the future of Linked Data for Language Technologies, and we hope you will be able to join us in Athens to share yours.
Regards,
Dave Lewis
Industry Survey Launch
David Lewis
Posted on: January 18, 2014
Organisations world-wide are struggling to make better use of the WWW to engage in meaningful online conversations with customers and citizens. To do this in a scalable and cost-effective way, many are turning to automated language technologies. These can assist in: discovering/extracting information; understanding opinions/trends; processing and managing multilingual/multimedia content and data; and monitoring/forecasting topics of interest.
However, if you are already considering or using language technologies you will understand the key role played by data in training automated language technologies to meet the needs of your specific application. Locating, collecting and determining the quality and relevance of such linguistic data therefore forms a major cost, and a barrier, for the successful use of language technologies.
Open linked data on the web, using standards developed by the W3C, may offer an ideal solution to discover and exchange linguistic data across a wide range of commercial and governmental applications. However, establishing international best-practice and developing open technical specifications requires a much better understanding of these different applications and their requirements.
To this end, a new W3C community group has been formed to assemble and discuss use cases and data handling requirements for language technology applications. We invite you now to join the Linked Data for Language Technology (LD4LT) group and engage in these activities. You can provide an indication of your particular interests and requirements via the initial survey at
We also invite you to participate in any of the upcoming road-mapping workshops being organised by the group at the following events:
LD4LT Group kick-off: 21 March in Athens, Greece, co-located with the European Data Forum 2014
7-8 May in Madrid, Spain, as part of the 7th W3C Multilingual Web Workshop
3 June in Dublin, Ireland, co-located with the LocWorld conference
The results of these consultations will be published for discussion via the LD4LT group. They will provide a roadmap for other interoperability, research and platform development activities spanning the language technology and linked data domains. These activities will include the W3C OntoLex and Best Practice in Multilingual Linked Open Data community groups as well as future EU-funded collaborations under the H2020 programme.
Get involved
Anyone may join this Community Group. All participants in this group have signed the W3C Community Contributor License Agreement.
Chairs
Andreas Blumauer
Penny Labropoulou
Christian Chiarcos
Participants: 94