Subject : Seminar of EuroSDR Linked Data Group - Summary
Date and venue : December 11th 2019, IGN, 73 avenue de Paris, 94 160 Saint Mande, FRANCE (Paris area). Because of the strikes, some participants could not attend in person and took part remotely via GoToMeeting.
- adoption of PLDN as semantic sandbox. EF distributes manuals and the group sees whether some contact is needed: who will be administrator, how to ask for an account.
- write a tender on GeoSPARQL support development for Triply: the group must write down what is expected (the priorities) and how it is proposed to be evaluated.
- prepare a more general GeoSPARQL benchmark (check whether others have done so too...), specifically including how different distances are supported in different triple stores.
- write guidelines on the support of semantic feature search (with attribute filtering)
- write guidelines on how to publish and interconnect statistical data with topographic data (to be edited collaboratively by those who have done it and those who need to do it)
- write guidelines on the generation of ergonomic data-cards for different types of resources
- prepare more detailed application challenges :
  - managing metadata with different scopes
  - digital assets management
  - smarter datacards
  - knowledge graph: specify use cases and constraints on the documentation
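As an illustration for the GeoSPARQL benchmark item above: one reason distance results can differ between stores is that a distance may be computed on the sphere (in metres) or directly on the coordinates (in degrees). The sketch below only illustrates that difference; the test points are hypothetical and it does not describe how any particular triple store behaves.

```python
import math

# Mean Earth radius in metres (spherical approximation, an assumption of this sketch).
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def planar_degrees(lon1, lat1, lon2, lat2):
    """Euclidean distance computed directly on the coordinates, in degrees."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


# Hypothetical test points: B is one degree east of A, C is one degree north of A.
A = (2.0, 48.0)
B = (3.0, 48.0)
C = (2.0, 49.0)

east_planar = planar_degrees(*A, *B)
north_planar = planar_degrees(*A, *C)
east_metres = haversine_m(*A, *B)
north_metres = haversine_m(*A, *C)
```

On the coordinates, both pairs are exactly one degree apart, while on the sphere the east-west pair is noticeably closer than the north-south pair. A benchmark would therefore need to state which distance semantics and which unit each test expects.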
Round table of participants :
- Erwin Folmer (Dutch Kadaster, The Netherlands)
- Bénédicte Bucher, IGN (French national mapping agency), researcher (metadata), EuroSDR (Commission 4 on Information usage)
- Marie-Dominique Van Damme, IGN, teacher and researcher at ENSG, distributed architectures
- Jordi Escriu (Catalunya, Spain), GI and SDI Analyst
- Nicolas Bus (CSTB - French Research Center for Buildings) : Web semantics research lead. Working on modeling buildings, cities, infrastructure and territories in order to check compliance with regulations. The team has a PhD student working on BIM (Building Information Modeling) / GIS interoperability. Part of the Linked Building Data group.
- Christophe Dzikowski, INSEE (National Statistical Institute, France)
- Dimitris Kotzinos (ETIS/University of Cergy Pontoise, France), Professor Lab. ETIS UMR 8051, University of Paris-Seine, University of Cergy-Pontoise, ENSEA, CNRS, & Dept. Sciences Informatiques, Université de Cergy-Pontoise, 2 av. Adolphe Chauvin, Site Saint Martin, Pontoise F-95302 Cergy-Pontoise, France
- Esa Tiainen, National Land Survey of Finland - FGI, Spatial Data Infrastructure Services
- Abdelfettah Feliachi (BRGM, the French geological survey), Interoperability & knowledge management engineer
- Mehdi Zrahl (IGN, France), PhD student working on a search engine for geographic datasets (started October 2019)
- Pekka Latvala, National Land Survey of Finland, Department of SDI Services, GIS Expert
- Emmanuel Seguin, IGN (French national mapping agency), project manager/developer at IGNfab (http://ignfab.ign.fr/)
Presentations and discussions :
Erwin Folmer presents different developments at Kadaster (labs.kadaster.nl), starting with results from a specific task of the European project OpenELS, which used Linked Data (and a chatbot-like interface) to facilitate querying complex data models. He presented "data stories", which are specific federated SPARQL queries that can be run to understand the value of interconnected Linked Data sets. A participant (Abdelfettah F) notes that providing different views on datasets resembles the notion of negotiation by profile discussed in a W3C WG. Erwin also presented a faceted GUI that is used to present Linked Data results and that dynamically associates a graphical widget with the different properties of the dataset. A last project, Loki, is about interconnecting datasets from different authorities in a more automatic way (than writing data stories) and supporting natural language queries. To do so, the approach is to build a knowledge graph (PDOK). All these results and demonstrators are openly available on the web.
Answers to the questions asked after the presentation: there is no strict national URI scheme in the Netherlands, but there are some national recommendations. Loki needs no specific infrastructure because the data are kept in their original stores, yet they must be Linked Data or graph data. Multilingualism is being experimented with using a Google API but does not work very well. There is some AI behind the NLP (to learn synonyms etc.), based on an open source framework. This is a prototype; going into production will probably be trickier.
Proposal from EF :
- Kadaster proposes to use data.pldn.nl (managed by Triply) as a sandbox environment for EduServ and more broadly.
- Possible additions could be GeoSPARQL support and data stories, if these are the technologies prioritized by the group.
- One proposed technology development is to give Triply an assignment to add GeoSPARQL support to this sandbox, possibly a subset of GeoSPARQL rather than all its functionalities. Many participants agree on the importance of such a development. N Bus already has some use cases that can be used as requirements.
- Other technology developments : implement data stories for more data sources (need to be WMS), assign Triply to develop the adaptor for every country
- Other proposal : organise an event in May or June
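To make the link between data stories and GeoSPARQL support more concrete, here is a minimal sketch of what a federated query with a distance filter could look like, built as a plain string in Python. The two endpoint URLs, the data layout and the function name are hypothetical; this is not one of the Kadaster data stories, and whether a given store evaluates `geof:distance` is precisely what the proposed development and benchmark would have to establish.

```python
from string import Template

# Hypothetical SPARQL endpoints, for illustration only.
BUILDINGS_ENDPOINT = "https://example.org/buildings/sparql"
PARCELS_ENDPOINT = "https://example.org/parcels/sparql"

# Federated SPARQL 1.1 query using the standard GeoSPARQL vocabulary and
# the geof:distance function with a unit of measure.
QUERY_TEMPLATE = Template("""\
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>

SELECT ?building ?parcel WHERE {
  SERVICE <$buildings> {
    ?building geo:hasGeometry/geo:asWKT ?buildingWkt .
  }
  SERVICE <$parcels> {
    ?parcel geo:hasGeometry/geo:asWKT ?parcelWkt .
  }
  FILTER (geof:distance(?buildingWkt, ?parcelWkt, uom:metre) < $max_distance_m)
}
""")


def build_data_story_query(max_distance_m):
    """Return the query text for pairs of buildings and parcels closer than the threshold."""
    return QUERY_TEMPLATE.substitute(
        buildings=BUILDINGS_ENDPOINT,
        parcels=PARCELS_ENDPOINT,
        max_distance_m=max_distance_m,
    )


query = build_data_story_query(100)
```

The resulting text can be sent to any endpoint that supports both SPARQL federation and GeoSPARQL functions; the sketch deliberately stops at building the query so that it runs without network access.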
Nicolas Bus describes a challenge, which is to measure the quality and traceability of data. This would require putting metadata on triples. It is not only about the quality model but also about how to attach information to triples. Technology = quads? RDF+?
AFeliachi proposes to send information about a solution that supports editing metadata associated with data (so it solves at least this part). An example of a tool to manage the workflow of creating and exposing vocabularies (with their metadata) as Linked Data : https://github.com/UKGovLD/registry-core/wiki. Example of implementation : https://data.geoscience.fr. It is also about supporting different contributions to metadata (like vocabularies).
Link to Kadaster metadata experiments: https://labs.kadaster.nl/dissemination/2019-11-26-Metadata-2-0/index.html
Suggestion: try to refine a specific use case in which the quality of a product made from different sources must be managed, so that we can distinguish between what existing technologies already solve and more open problems.
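One way to picture the "quads" option mentioned above is to store each statement with a fourth element, the named graph it belongs to, and to attach provenance to that graph. The sketch below uses plain Python tuples rather than an RDF library; all `ex:` identifiers are hypothetical, and `prov:wasDerivedFrom` is borrowed from PROV-O purely as an example property.

```python
from collections import namedtuple

# A quad is a triple plus the named graph that contains it.
Quad = namedtuple("Quad", "subject predicate obj graph")

# Hypothetical data: two facts about one building, coming from two sources.
QUADS = [
    Quad("ex:building42", "ex:height", "12.5", "ex:graph/surveyA"),
    Quad("ex:building42", "ex:material", "brick", "ex:graph/sourceB"),
]

# Metadata is attached to the graph identifier, not to each individual triple.
GRAPH_METADATA = {
    "ex:graph/surveyA": {"prov:wasDerivedFrom": "ex:source/lidar2019"},
    "ex:graph/sourceB": {"prov:wasDerivedFrom": "ex:source/permitRegister"},
}


def provenance_of(subject, predicate):
    """Return the sources of every statement matching the subject and predicate."""
    return [
        GRAPH_METADATA[q.graph]["prov:wasDerivedFrom"]
        for q in QUADS
        if q.subject == subject and q.predicate == predicate
    ]


height_sources = provenance_of("ex:building42", "ex:height")
```

With this layout, traceability is as fine-grained as the graphs are: one graph per source gives per-source provenance, while per-statement quality would need one graph per statement or a statement-level mechanism.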
Esa Tiainen presented some proposed challenges regarding SDI (from NLS Finland) :
1) Machine readability; Linked Data API layer on SDI
- Machine readability and APIs provided on top of an RDF triple store, linking through ontology linking or Elasticsearch indexing
- Goal: feasible implementation of a Linked Data SDI
- In the new national geodata platform, semantic feature search with attribute filtering is implemented by deploying the linking as an RDF triple store, where each entity in the service interface schemas (features, attributes and attribute values) is converted into an RDF entity, and the entities are bridged (linked) to each other by annotating them with their URIs. The RDF structure is then queried in the API with SPARQL queries.
- This RDF triple store can be indexed in Elasticsearch, with an API added on top. (Because no ontology with adequate granularity for natural or common language search was available, the linking had to be made directly to service interface schema entities instead of using natural language concepts in an ontology.)
2) A EuroSDR "sandbox" project to compile and reconcile a common (universal) ontology among a few countries, following the UOGS (Universal Ontology of Geographic Space) methodology developed by Marjan Ceh et al.
- using Prolog or another, further developed tool with NLP; a user interface is needed
- at the same time, evaluate whether the NLS FI approach for semantic feature search with attribute filtering could be useful as part of the UOGS methodology, as it is a straightforward way to identify the similarities between concepts represented by data object types through their attributes and those appearing in real life, namely in the service interfaces, which could be consumed in a machine-readable way as well
- at the same time, introduce the notion of Geographic Space (GS) as a data linking paradigm to integrate data from different sources (spatial and other types) on a certain area of interest, and further present this GS content as a linked data collection (or sub-collections of different views) in a graph format: basically a knowledge graph, but introduced (and promoted) with a spatial approach
- How to address the content extent of INSPIRE data definitions, given their fragmentation and level of detail: which themes to select?
- GEMET to include and comply with: to enrich and transform into a common universal ontology?
3) Metadata
- needed by consumers and developers to promote linked data
- Addressing data providers and SDI: simple, easy-to-approach guidelines for the use of SHACL, PROV-O, ShEx… and their roles; basically there is a need for provenance information to expand the use of linked data, including through APIs
- co-work with the W3C/OGC SDWIG (Spatial Data on the Web Interest Group) initiated, Toulouse, Nov 19th 2019; EuroSDR Linked Data Group to contribute!
- Tools to enhance access to metadata based on the user profile and on understanding it (different metadata for different uses)
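The semantic feature search described in challenge 1 can be pictured with a toy example: schema entities (feature types, attributes, attribute values) become nodes linked by triples, and a search walks from an attribute value back to the feature types that carry it. This is only a sketch under strong assumptions: the `ex:` vocabulary is invented here, and a real implementation would run SPARQL against a triple store instead of matching Python tuples.

```python
# Hypothetical schema graph: feature types, their attributes and allowed values.
TRIPLES = {
    ("ex:Building", "rdf:type", "ex:FeatureType"),
    ("ex:Building", "ex:hasAttribute", "ex:Building.use"),
    ("ex:Building.use", "ex:hasValue", "ex:use/residential"),
    ("ex:Road", "rdf:type", "ex:FeatureType"),
    ("ex:Road", "ex:hasAttribute", "ex:Road.surface"),
    ("ex:Road.surface", "ex:hasValue", "ex:surface/paved"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def feature_types_with_value(value):
    """Find feature types having an attribute that can take the given value."""
    found = set()
    for attribute, _, _ in match(p="ex:hasValue", o=value):
        for feature_type, _, _ in match(p="ex:hasAttribute", o=attribute):
            found.add(feature_type)
    return sorted(found)


residential_types = feature_types_with_value("ex:use/residential")
```

The two nested lookups correspond to a two-pattern SPARQL query joining on the attribute node, which is the kind of query the API layer would issue.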
Jordi Escriu presents ongoing concerns at ICGC :
Need for daily maintenance of SDI components, keeping them consistent. See if an LD model can help.
Challenge : propose a structure for an SDI components management system (one table with many different columns) that can serve the following purposes :
- identification of SDI components to be updated in case there is a change, and writing notification messages according to this model when a change occurs
- assessing the technology behind (OS or not)
- INSPIRE conformance labelling
- ...
Note from AF : BRGM is developing a GeoServer plugin solution that could partly meet this concern and can share it.
Challenge : easily obtain the initial RDF content from existing components on the fly. (Added Dec 16th. Note: RDF data can be synchronized with the source data, e.g. with an HTML update file or Atom feed, when updating or creating a new spatial object in the source data.)
Another challenge mentioned is to develop a user interface for managing changes. Note from AF : technologies exist to propagate updates and subscribe to changes, so we may need to see how they apply to this problem.
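The first purpose listed above, identifying which SDI components must be updated after a change, amounts to walking a dependency graph. The sketch below assumes such a graph is available as a simple mapping; the component names and the message wording are hypothetical and only serve to illustrate the idea.

```python
from collections import deque

# Hypothetical dependency graph: component -> components that depend on it.
DEPENDENTS = {
    "dataset:topo": ["service:wms-topo", "service:wfs-topo"],
    "service:wms-topo": ["viewer:geoportal"],
    "service:wfs-topo": ["viewer:geoportal", "metadata:wfs-topo"],
}


def affected_components(changed):
    """Breadth-first walk returning every component impacted by a change, nearest first."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in DEPENDENTS.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


def notifications(changed):
    """Build one notification message per impacted component."""
    return [f"{c} must be checked because {changed} changed" for c in affected_components(changed)]


impacted = affected_components("dataset:topo")
```

If the components and their dependencies were described as Linked Data, the same walk could be expressed as a SPARQL property path over a "depends on" property.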
Other challenge proposed by CD : linking French statistical data with French topo data. This has already been implemented in other countries, so here the challenge would rather be to set up a readable and reusable procedure (guidelines, good practices). For example : NLS has a process called IGALOD: https://www.efgs.info/wp-content/uploads/2019/06/Pihlajamaa_Koistinen_IGALOD-Project.pdf, or https://www.slideshare.net/Tilastokeskus/integrating-areal-classifications-and-geospatial-data-case-igalod-jennika-leino-essi-kaukonen-tuuli-pihlajamaa-rinatammisto-and-daniel-davis-statistics-finland-and-eero-hietanen-and-kai-koistinen-national-land-survey-of-finland
- BB : Having ready-to-use ergonomic data-cards for different types of resources (= URI dereferencing). This is an example data card at the TriplyDB: https://data.pldn.nl/cbs/wijken-buurten/regios/2016/id/buurt/BU03010500 and this is an ugly example at ours: https://brt.basisregistraties.overheid.nl/top10nl/doc/gebouw/122436259 (the technology here is what we put on servers). Who are these cards for? Possibly integrating data stories. The use case needs to be more specific because the difficulties differ.
- Metadata: provenance and quality documentation : there are solutions to attach metadata, but the open question is the model (the ontology?) for these metadata and the interface (??). Ontology approach (i.e. metadata on datasets, objects or properties) : https://www.w3.org/TR/vocab-dqv/ Property graph approach : https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0144578 RDF approach : http://blog.liu.se/olafhartig/2019/01/10/position-statement-rdf-star-and-sparql-star/ (RDF Star, not RDF plus, actually). ET: Metadata, metaschema? RDF+? Three aspects: provenance (SHACL, PROV-O, ShEx…), … connection to graph. Co-operation with OGC/W3C SDWIG initiated, Toulouse, 19.11.2019.
- Discussion on resources: masterships, need for projects
- Generation of ontologies and KG : must be done with a use case in mind because different tools exist. A use case could be a chatbot that interfaces users with several data sources (e.g. urban climate scientists querying data about buildings: neighbours, material, inhabitants, ...). Besides, how the ontology is created must be considered (semi-automatically, based on metadata?). Geographic space as one shared ontology?
- The idea of managing all data services through linked data representations. Ontology for digital assets management, incl. alignment with Wikidata and other sources. It is interesting to see how, from this management perspective on one's assets, the findability and usability of the data can be improved.
General questions/remarks: concern about privacy of data.
How to handle data versioning while keeping URI persistence?
Other references mentioned during the seminar :
Eduserv course webpage (registration is open): http://www.eurosdr.net/sites/default/files/images/inline/eduserv2020_spatial_linked_open_data.pdf