SAVOR ECHOES · Cultural Heritage Cloud
Stage 1 Proposal · Bilingual recipe corpus

Three thousand recipes,
two languages,
one open corpus.

Project: SAVOR
Languages: Română ◇ Italiano
Corpus: 3,300 recipes · ~180k triples
Delivery: M12 · Zenodo DOI
Romanian Academy · RACAI · USAMV · Casa Artusi · 2026 · v1.0
SAVOR Part one
The corpus
01

From manuscript
to machine.

SAVOR / Tavola 02 / 10
SAVOR 1.1 — Audiences
Who SAVOR is for

Three audiences,
one shared corpus.

SAVOR is designed to be legible at three altitudes — to the curator who works folio by folio, to the researcher who works at scale, and to the cook who simply wants to know what's for dinner.

  • Digital humanists: A clean, validated corpus structured against schema.org/Recipe and FOODon. Provenance preserved per record.
  • NLP researchers: Multilingual LaBSE embeddings, FAISS index, REST + SPARQL surfaces for cross-lingual retrieval.
  • The curious: Sunday-lunch dishes from Cluj alongside Sunday-lunch dishes from Romagna. What changes, what doesn't.
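
For the NLP audience, cross-lingual retrieval comes down to nearest-neighbour search over normalised embeddings. The following is a minimal sketch in plain Python, not the project's implementation. It assumes toy 3-dimensional vectors in place of real 768-dimensional LaBSE output; the two `sav:it/...` ids and all vector values are hypothetical. At corpus scale a FAISS inner-product index would presumably replace the linear scan.

```python
import math

def normalise(vec):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]

def search(query_vec, index, k=2):
    """Return the k (recipe_id, score) pairs closest to query_vec by cosine similarity."""
    q = normalise(query_vec)
    scored = [
        (recipe_id, sum(a * b for a, b in zip(q, normalise(vec))))
        for recipe_id, vec in index.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy embeddings; real LaBSE vectors have 768 dimensions.
INDEX = {
    "sav:ro/cluj/1934/placinta-varza-ciuperci": [0.9, 0.1, 0.2],
    "sav:it/artusi/1891/example-savoury-pie": [0.8, 0.2, 0.3],
    "sav:it/artusi/1891/example-dessert": [0.1, 0.9, 0.1],
}

# A query vector standing in for the embedding of a Romanian search phrase.
hits = search([0.85, 0.15, 0.25], INDEX, k=2)
for recipe_id, score in hits:
    print(f"{score:.3f}  {recipe_id}")
```

Because both languages share one embedding space, the same query can surface Romanian and Italian records side by side without a translation step.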
SAVOR 1.2 — A recipe in two languages
RO · Română
Plăcintă cu varză și ciuperci
Cluj-Napoca · 1934 · ms.b3.f47
IT · Italiano
Sfoglia ripiena di cavolo e funghi
Translated via FOODon + LaBSE alignment
a fi ținută la rece o oră înainte (bunica) · "keep chilled for an hour beforehand" (grandmother's note)
SAVOR 1.3 — By the numbers
The corpus, at a glance

3,300 recipes,
three sources, two languages.

2,500
Romania
2,000 interwar manuscript recipes + 500 contemporary student recipes from USAMV Cluj.
800
Italia
Pellegrino Artusi's 1891 cookbook, digitised in XML by Casa Artusi. Primarily Emilia-Romagna.
181k
RDF triples
Linked-data graph in Turtle, ingested into the ECHOES Cultural Heritage Cloud.
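
To make the triple count concrete, here is a hedged sketch of how one record could be serialised to Turtle. The predicate choices (`schema:name`, `dcterms:spatial`, `dcterms:temporal`) and all namespace IRIs are assumptions for illustration, and the `sav` base IRI is a placeholder. Identifiers are written as full IRIs because the slashes in a SAVOR recipe id are not allowed unescaped in a Turtle prefixed name.

```python
# Assumed namespace IRIs; the `sav` base in particular is a placeholder.
PREFIXES = {
    "sav": "https://example.org/savor/",
    "geonames": "https://sws.geonames.org/",
    "periodo": "http://n2t.net/ark:/99152/",
}

def expand(curie):
    """Expand 'prefix:local' to a full IRI in angle brackets."""
    prefix, local = curie.split(":", 1)
    return f"<{PREFIXES[prefix]}{local}>"

def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def record_to_turtle(record):
    """Serialise a few fields of a SAVOR-JSON-style record as Turtle."""
    subject = expand(record["recipeId"])
    pairs = [("a", "schema:Recipe")]
    for lang, title in sorted(record["name"].items()):
        pairs.append(("schema:name", f'"{escape(title)}"@{lang}'))
    pairs.append(("dcterms:spatial", expand(record["region"])))
    pairs.append(("dcterms:temporal", expand(record["period"])))
    body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
    header = (
        "@prefix schema: <https://schema.org/> .\n"
        "@prefix dcterms: <http://purl.org/dc/terms/> .\n\n"
    )
    return f"{header}{subject}\n    {body} .\n"

record = {
    "recipeId": "sav:ro/cluj/1934/placinta-varza-ciuperci",
    "name": {
        "ro": "Plăcintă cu varză și ciuperci",
        "it": "Sfoglia ripiena di cavolo e funghi",
    },
    "region": "geonames:665087",
    "period": "periodo:p0qhvzd",
}
turtle = record_to_turtle(record)
print(turtle)
```

This reduced record yields five triples: one type statement, two language-tagged titles, one place and one period.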
SAVOR 1.4 — On the value of structured heritage
A bilingual recipe corpus is not a database of dishes — it is the kitchen as archive, the archive as conversation.
Prof. Massimo Montanari · Food historian, Casa Artusi · formerly University of Bologna
SAVOR 2.0 — Methodology
From manuscript to Zenodo

Six stages, twelve months,
three quality gates.

01
Source consolidation
XML, CSV, OCR text harmonised in UTF-8 with multilingual titling.
M1 — M2
02
Schema & validation
SAVOR-JSON extends schema.org/Recipe. CI on every record.
M1 — M4
03
Semantic annotation
spaCy + FOODon + Wikidata. GeoNames + PeriodO for context.
M3 — M7
04
AI-readiness
LaBSE embeddings, FAISS, REST API, Docker container, notebooks.
M4 — M10
05
Quality assurance
Schema validation, linguistic proofreading, expert content review.
ongoing
06
Cloud ingestion
RDF/Turtle, BagIt, ECHOES PIDs, OAI-PMH, Zenodo DOI.
M8 — M12
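
As an illustration of stage 01, the sketch below shows one way heterogeneous source text could be harmonised to UTF-8. It is an assumption about the kind of clean-up involved, not the project's pipeline: the `harmonise` helper, the ISO-8859-2 input and the cedilla-to-comma mapping are all hypothetical. The mapping targets a common defect in legacy Romanian text, where ş and ţ with cedilla stand in for the standard comma-below letters ș and ț.

```python
import unicodedata

# Cedilla forms (U+015E/U+015F, U+0162/U+0163) mapped to the
# comma-below forms (U+0218/U+0219, U+021A/U+021B).
CEDILLA_TO_COMMA = str.maketrans({
    "\u015e": "\u0218",  # S with cedilla -> S with comma below
    "\u015f": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021a",  # T with cedilla -> T with comma below
    "\u0163": "\u021b",  # t with cedilla -> t with comma below
})

def harmonise(raw, encoding="utf-8"):
    """Decode bytes, apply NFC, and map cedilla letters to comma-below forms."""
    text = raw.decode(encoding)
    text = unicodedata.normalize("NFC", text)
    return text.translate(CEDILLA_TO_COMMA)

# A title as it might arrive from a legacy ISO-8859-2 export (hypothetical input).
legacy = "Pl\u0103cint\u0103 cu varz\u0103 \u015fi ciuperci".encode("iso-8859-2")
clean = harmonise(legacy, encoding="iso-8859-2")
utf8_bytes = clean.encode("utf-8")
print(clean)
```

NFC alone leaves the cedilla letters untouched, which is why the explicit translation table runs after normalisation.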
SAVOR / Lab 3.1 — SAVOR-JSON · the schema
Lab register · the dataset

Every recipe
is a record.

SAVOR-JSON extends schema.org/Recipe with culinary-heritage-specific fields. The metadata spine is Qualified Dublin Core for ECHOES Cloud interoperability; recipe detail is carried in schema.org and FOODon. Every record validates before ingestion.

// /recipes/ro/cluj/1934/placinta-varza-ciuperci.json
{
  "@type": "Recipe",
  "recipeId": "sav:ro/cluj/1934/placinta-varza-ciuperci",
  "name": {
    "ro": "Plăcintă cu varză și ciuperci",
    "it": "Sfoglia ripiena di cavolo e funghi"
  },
  "region": "geonames:665087",
  "period": "periodo:p0qhvzd",
  "festive": ["wd:Q3406641"],
  "diet": ["vegan", "de-post"],
  "license": "CC-BY-SA-4.0",
  "embedding": "<float[768], LaBSE>"
}
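
The pre-ingestion check could look roughly like the sketch below. The required fields, identifier patterns and two-language rule are assumptions inferred from the example record above, not the published SAVOR-JSON schema, and a CI setup would more likely run a JSON Schema validator than hand-written checks.

```python
import re

# Assumed rules, inferred from the example record only.
REQUIRED = ("@type", "recipeId", "name", "region", "period", "license")
CURIE_RULES = {
    "recipeId": re.compile(r"^sav:[a-z]{2}/[\w/-]+$"),
    "region": re.compile(r"^geonames:\d+$"),
    "period": re.compile(r"^periodo:\w+$"),
}
LANGUAGES = {"ro", "it"}

def validate(record):
    """Return a list of problems; an empty list means the record passes."""
    errors = [f"missing field: {key}" for key in REQUIRED if key not in record]
    if record.get("@type") not in (None, "Recipe"):
        errors.append("@type must be 'Recipe'")
    for key, pattern in CURIE_RULES.items():
        value = record.get(key)
        if value is not None and not pattern.match(str(value)):
            errors.append(f"malformed {key}: {value!r}")
    name = record.get("name")
    if name is not None:
        if not isinstance(name, dict):
            errors.append("name must map language codes to titles")
        else:
            for lang in sorted(LANGUAGES - set(name)):
                errors.append(f"missing title language: {lang}")
    return errors

good = {
    "@type": "Recipe",
    "recipeId": "sav:ro/cluj/1934/placinta-varza-ciuperci",
    "name": {"ro": "Plăcintă cu varză și ciuperci",
             "it": "Sfoglia ripiena di cavolo e funghi"},
    "region": "geonames:665087",
    "period": "periodo:p0qhvzd",
    "license": "CC-BY-SA-4.0",
}
# A deliberately broken copy: free-text region and no Italian title.
bad = dict(good, region="Cluj", name={"ro": "Plăcintă cu varză și ciuperci"})

print(validate(good))
print(validate(bad))
```

Returning a list of problems rather than raising on the first one lets a CI job report every defect in a record in a single pass.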
SAVOR / Lab 3.2 — Architecture
Pipeline

Source → Schema → Semantic → Surface.

01 / SOURCE
Raw text
  • XML · Casa Artusi
  • CSV · USAMV student
  • OCR · RO archives
02 / SCHEMA
SAVOR-JSON
  • schema.org/Recipe
  • Dublin Core spine
  • CI validation
03 / SEMANTIC
Annotation
  • FOODon · Wikidata
  • GeoNames · PeriodO
  • FAO · nutrition
04 / SURFACE
Access layer
  • REST · /v1/converse
  • SPARQL endpoint
  • FAISS · LaBSE
SAVOR Thank you
SAVOR · 2026

A bilingual
corpus, made open.

  • RO · Coordinator: Romanian Academy, Cluj-Napoca
  • RO: RACAI "Mihai Drăgănescu"
  • RO: USAMV, Cluj-Napoca
  • IT: Casa Artusi, Forlimpopoli
savor.eu · github.com/savor-corpus · doi:10.5281/zenodo.8729471