Contributions are welcome. This document describes the expected workflow.
Reporting issues
Open a GitHub issue with:
- Framework and version involved (if relevant).
- Reproduction steps: which script was run, with what command, and what output appeared.
- Your R session info (sessionInfo()).
- If verification failed: the full output of Rscript scripts/015-verify-ingestion.R.
Adding a new framework
See the vignette adding-a-framework.Rmd for the complete six-step process. Summary:
- Pick a slug and framework prefix.
- Stage the source file at data/raw/<slug>/ and write scripts/010-ingest-<slug>.R.
- Declare invariants in docs/framework-invariants.yml.
- Add verification field mappings in scripts/015-verify-ingestion.R.
- Add a JSON-LD assembly adapter in scripts/020-assemble-jsonld.R.
- Verify, assemble, run queries.
Include a provenance.yml manifest and (if applicable) notes on extraction limitations.
Code style
- Native R pipes (|>) and tidyverse idioms where reasonable.
- here::here() for paths. No hardcoded paths.
- suppressPackageStartupMessages({ library(...) }) at the top of scripts.
- Function names descriptive, not abbreviated. No pork for variables.
- Spelling: package prose (vignettes, NEWS, roxygen comments) uses American English (Language: en-US in DESCRIPTION). Framework names that originate with their publishers (e.g., “European Commission Joint Research Centre”) retain the publisher’s spelling. Run devtools::spell_check() before opening a PR. Add new domain vocabulary to inst/WORDLIST rather than rewording.
- Every ingestion script writes a provenance.yml with SHA256, retrieval date, and licensing.
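The last bullet might look like the following in practice. This is a minimal sketch, not the project's actual helper: the function name write_provenance, the YAML field names, and the use of the digest and yaml packages are all assumptions made for illustration.

```r
suppressPackageStartupMessages({
  library(digest)  # assumed choice for hashing; the project may use another package
  library(yaml)    # assumed choice for writing YAML
})

# Hypothetical helper: record the checksum, retrieval date, and license of a
# staged source file in a provenance.yml written next to that file.
write_provenance <- function(source_path, license, retrieved = Sys.Date()) {
  stopifnot(file.exists(source_path))

  provenance <- list(
    source_file = basename(source_path),
    # file = TRUE hashes the file contents rather than the path string
    sha256      = digest::digest(source_path, algo = "sha256", file = TRUE),
    retrieved   = format(as.Date(retrieved), "%Y-%m-%d"),
    license     = license
  )

  out_path <- file.path(dirname(source_path), "provenance.yml")
  yaml::write_yaml(provenance, out_path)
  invisible(out_path)
}
```

In an ingestion script the source path would be built with here::here() rather than typed out, and the license string would come from the upstream publisher's terms.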
Tests
New functions need unit tests under tests/testthat/. Run tests with:
devtools::test()
# or
testthat::test_local()
Documentation
- Every exported function needs roxygen docs with @param, @return, @export.
- Regenerate NAMESPACE and man/ with devtools::document().
- Update NEWS.md for user-facing changes.
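Taken together, the Tests and Documentation requirements might look like this for a small exported helper. The function make_framework_id and its behavior are hypothetical, invented only to illustrate the roxygen tags and a matching testthat test.

```r
library(testthat)

#' Build a framework-prefixed identifier (hypothetical example)
#'
#' @param prefix Character scalar, the framework prefix.
#' @param code Character vector of item codes within the framework.
#' @return A character vector of identifiers of the form `prefix:code`.
#' @export
make_framework_id <- function(prefix, code) {
  stopifnot(is.character(prefix), length(prefix) == 1L)
  paste(prefix, code, sep = ":")
}

# In a package this test would live in its own file under tests/testthat/.
test_that("make_framework_id prefixes every code", {
  expect_equal(make_framework_id("abc", c("1", "2")), c("abc:1", "abc:2"))
  expect_error(make_framework_id(c("a", "b"), "1"))
})
```

After adding a function like this, devtools::document() regenerates the NAMESPACE entry and the man/ page from the roxygen block.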
Framework licensing
New frameworks must document their upstream license in data/raw/<slug>/provenance.yml. If the source license prohibits redistribution, the ingestion script should require user-supplied source files rather than auto-download. See SFIA and DCWF ingestion scripts for the manual-stage pattern.
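One way to sketch the manual-stage idea is shown below. It is not the code from the SFIA or DCWF scripts; the helper name require_staged_source and its arguments are assumptions, and it uses base R only.

```r
# Hypothetical helper: stop with instructions when a non-redistributable
# source file has not been staged by the user, instead of downloading it.
require_staged_source <- function(path, download_page) {
  if (!file.exists(path)) {
    stop(
      "Source file not found at ", path, ".\n",
      "Its license does not allow redistribution, so download it from ",
      download_page, " and place it at that path before re-running.",
      call. = FALSE
    )
  }
  invisible(path)
}
```

An ingestion script would call this once near the top, so a missing file fails early with a clear message rather than partway through parsing.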
