Use Ontology to correctly assign data types in Virtuoso

I have ingested the Geonames RDF dump (https://download.geonames.org/all-geonames-rdf.zip) into a Virtuoso instance, and I've been running queries against it with varying degrees of success. However, I've found that certain objects have the incorrect datatype. For example, population is encoded using xsd:string, and therefore trying to sort by population ends up sorting the results in lexicographic order:

PREFIX  xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX  gn:   <http://www.geonames.org/ontology#>

SELECT ?country ?name ?population (datatype(?population) AS ?type)
WHERE {
    ?country a gn:Feature .
    ?country gn:name ?name .

    # A.PCLI is the feature code for 'independent political entity'
    ?country gn:featureCode <https://www.geonames.org/ontology#A.PCLI> .
    ?country gn:population ?population .
}
ORDER BY DESC(?population)
LIMIT 10

Results:

country                  name            population   type
https://ift.tt/3Fg8k6V   China           1330044000   https://ift.tt/2VRgVvo
https://ift.tt/3c4uLiM   India           1173108018   https://ift.tt/2VRgVvo
https://ift.tt/3HhNLc1   United States   310232863    https://ift.tt/2VRgVvo
https://ift.tt/3oj64ou   Indonesia       242968342    https://ift.tt/2VRgVvo
https://ift.tt/31Ylaby   Brazil          201103330    https://ift.tt/2VRgVvo

I know I can cast the variable to get the correct result, like so: ORDER BY DESC(xsd:integer(?population)). But once my queries get more complicated, this no longer works, specifically when I run subqueries and use their results to apply further logic. For example:

PREFIX  xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX  gn:   <http://www.geonames.org/ontology#>

SELECT ?cityName ?countryName ?population (datatype(?population) AS ?type)
WHERE
{
    ?city  gn:parentCountry ?country ;
           gn:population    ?population ;
           gn:name          ?cityName .

    ?country gn:name ?countryName .

    {
        # a) SELECT ?country (MAX(?population) AS ?population)
        # b) SELECT ?country (MAX(xsd:integer(?population)) AS ?population)     
        # c) SELECT ?country (xsd:string(MAX(xsd:integer(?population))) AS ?population)
        WHERE 
        {
            ?city   a                gn:Feature ;
                    gn:featureClass  <https://www.geonames.org/ontology#P> ;
                    gn:population    ?population ;
                    gn:parentCountry ?country .

        }
        GROUP BY ?country
        ORDER BY DESC(?population)
    }
}

Select a returns the populations in lexicographic order, as before.
Select b orders the populations correctly, but because the subquery now returns the population as an integer, it no longer matches the string-typed population outside the subquery (I'm comparing strings with integers). So b returns an empty result set.
Select c was my attempt to cast the result back to a string so that it can be matched outside the subquery, but it ends in a timeout (estimated 4000 second execution time).
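One variant of select b that might sidestep the type mismatch is to give the aggregate its own variable name and compare it to a cast of the outer population in a FILTER, rather than joining on ?population directly. This is only a sketch: ?maxPopulation is a name introduced here for illustration, and I have not measured whether it performs any better than select c on this dataset.

```sparql
PREFIX  xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX  gn:   <http://www.geonames.org/ontology#>

SELECT ?cityName ?countryName ?population
WHERE
{
    ?city  gn:parentCountry ?country ;
           gn:population    ?population ;
           gn:name          ?cityName .

    ?country gn:name ?countryName .

    {
        # ?maxPopulation is an integer; ?population stays local to the subquery
        SELECT ?country (MAX(xsd:integer(?population)) AS ?maxPopulation)
        WHERE
        {
            ?city   a                gn:Feature ;
                    gn:featureClass  <https://www.geonames.org/ontology#P> ;
                    gn:population    ?population ;
                    gn:parentCountry ?country .
        }
        GROUP BY ?country
    }

    # Compare integer with integer instead of string with integer
    FILTER (xsd:integer(?population) = ?maxPopulation)
}
ORDER BY DESC(?maxPopulation)
```

The outer ?population keeps its original string form, so it can still be displayed as stored, while the comparison and the ordering both happen on integers.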

My question is this: Is there a way to either

a) change the datatype in Virtuoso manually
b) use the Geonames ontology to instruct Virtuoso about the correct types
c) alter my query to more efficiently cast to the correct type

I'm hoping option b is possible, as it seems the most effective solution: the Geonames ontology already specifies the correct types for the objects of these predicates.
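For option a, one possibility I can imagine is a SPARQL 1.1 Update that replaces each string-typed population with an integer-typed literal. The following is a minimal, untested sketch. It assumes the data was loaded into a named graph (the graph IRI below is a placeholder, not the real one) and that the account running it has update permission on that graph.

```sparql
PREFIX  xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX  gn:   <http://www.geonames.org/ontology#>

# <http://example.org/geonames> is a placeholder graph IRI (assumption)
WITH <http://example.org/geonames>
DELETE { ?feature gn:population ?old }
INSERT { ?feature gn:population ?new }
WHERE
{
    ?feature gn:population ?old .

    # Only touch values that are still typed as strings
    FILTER (datatype(?old) = xsd:string)

    # Cast to integer; ?new stays unbound if the cast fails
    BIND (xsd:integer(?old) AS ?new)

    # Skip rows where the cast failed, so no value is deleted without a replacement
    FILTER (BOUND(?new))
}
```

On a dump of this size, it may be necessary to run such an update in batches or to fix the types before loading instead.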

You can find the Geonames ontology here.
You can test the queries above and your own against our endpoint here: http://18.170.45.162:8890/sparql



from Recent Questions - Stack Overflow https://ift.tt/3Cc4F83