Efficient and Focused Web Crawling for Statistical Data Sources Retrieval
Democratic societies crucially depend on the possibility of establishing a common view of (at least part of) reality. While the Web and modern means of communication exponentially increase the opportunities for public figures and private citizens to state opinions and express themselves, having the means to check how various statements stand with respect to existing reference data sources is becoming increasingly important. Verification has always been part of journalists’ work, as they analyze and comment on public statements. For efficiency, and to leverage the wealth of reference data sources available online, automatic or semi-automatic methods have been developed to help users engaged in verification. The StatCheck system, which builds on earlier work, pursues this goal with a pipeline that handles both claim detection and statistical fact extraction, using data from reference sources such as INSEE and EuroStat to build its database of trustworthy facts.
- Authors: Antoine Gauquier, Ioana Manolescu and Pierre Senellart
- Article: