Documentation
Programmatic access to Palladia benchmarks
Overview
This page explains how to access the demo data used on this website. The Palladia GitHub repository provides free and unlimited access to benchmark results. All benchmark responses are JSON-encoded.
Base URL
https://raw.githubusercontent.com/Dassoo/Palladia/refs/heads/main/benchmarks
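Every endpoint is a path appended to this base URL. As a minimal sketch, the join could be written as below. The names `BASE_URL` and `build_url` are hypothetical helpers introduced here for illustration and are not part of Palladia. Percent-encoding each segment is an assumption made for safety, in case a name ever contains spaces or other reserved characters.

```python
from urllib.parse import quote

# Base URL as given above.
BASE_URL = "https://raw.githubusercontent.com/Dassoo/Palladia/refs/heads/main/benchmarks"


def build_url(*segments: str) -> str:
    """Join the base URL with path segments (hypothetical helper).

    Each segment is percent-encoded so that characters such as spaces
    or slashes inside a name cannot change the path structure.
    """
    encoded = [quote(segment, safe="") for segment in segments]
    return "/".join([BASE_URL, *encoded])


# Builds the URL of the manifest file.
print(build_url("manifest.json"))
```

Names made up only of letters, digits, hyphens, dots, and underscores pass through unchanged, so the example values used on this page produce the same URLs as plain string concatenation.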
Usage
JavaScript
const category = 'EarlyModernLatin';
// Named documentName to avoid clashing with the browser's global `document`.
const documentName = '1564-Thucydides-Valla';
const filename = '00363.bin';
const url = `https://raw.githubusercontent.com/Dassoo/Palladia/refs/heads/main/benchmarks/GT4HistOCR/corpus/${category}/${documentName}/${filename}.json`;
fetch(url)
  .then(response => response.json())
  .then(data => {
    console.log('OCR Data:', data);
  })
  .catch(error => console.error('Error:', error));
Python
import requests
category = 'EarlyModernLatin'
document = '1564-Thucydides-Valla'
filename = '00363.bin'
url = f'https://raw.githubusercontent.com/Dassoo/Palladia/refs/heads/main/benchmarks/GT4HistOCR/corpus/{category}/{document}/{filename}.json'
response = requests.get(url)
if response.status_code == 200:
    data = response.json()
    print(data)
else:
    print(f'Error: {response.status_code}')
cURL
curl \
  -H "Accept: application/json" \
  https://raw.githubusercontent.com/Dassoo/Palladia/refs/heads/main/benchmarks/GT4HistOCR/corpus/EarlyModernLatin/1564-Thucydides-Valla/00363.bin.json
Endpoints
GET /manifest.json
Retrieve the list of all benchmarked documents.
GET /GT4HistOCR/corpus/{corpus_name}/{document_name}/_summary.json
Retrieve detailed information about a document.
GET /GT4HistOCR/corpus/{corpus_name}/{document_name}/{benchmark_file}.json
Retrieve the benchmark results for a specific file. To retrieve the source image instead, replace ".json" with ".png".
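The three endpoints and the ".json" to ".png" swap can be sketched as small Python helpers. This is an illustration under stated assumptions, not an official client: every function name below is hypothetical, and `fetch_json` uses only the standard library and makes no assumption about the structure of the returned JSON.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://raw.githubusercontent.com/Dassoo/Palladia/refs/heads/main/benchmarks"
CORPUS_ROOT = f"{BASE_URL}/GT4HistOCR/corpus"


def manifest_url() -> str:
    """URL of the list of all benchmarked documents."""
    return f"{BASE_URL}/manifest.json"


def summary_url(corpus_name: str, document_name: str) -> str:
    """URL of the summary file for one document."""
    return f"{CORPUS_ROOT}/{corpus_name}/{document_name}/_summary.json"


def benchmark_url(corpus_name: str, document_name: str, benchmark_file: str) -> str:
    """URL of the benchmark results for one file, e.g. '00363.bin'."""
    return f"{CORPUS_ROOT}/{corpus_name}/{document_name}/{benchmark_file}.json"


def to_image_url(json_url: str) -> str:
    """Swap the trailing '.json' for '.png' to point at the source image."""
    if not json_url.endswith(".json"):
        raise ValueError(f"expected a URL ending in .json, got {json_url!r}")
    return json_url[: -len(".json")] + ".png"


def fetch_json(url: str, timeout: float = 10.0):
    """Download a URL and parse the body as JSON.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    missing file surfaces as an exception rather than a parse error.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With these helpers, `fetch_json(manifest_url())` would download the manifest, and `to_image_url(benchmark_url('EarlyModernLatin', '1564-Thucydides-Valla', '00363.bin'))` would give the address of the matching source image. The image is a PNG, so it should be downloaded as bytes rather than passed to `fetch_json`.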