Collapse Index™

> Collapse Index Datasets.

Data isn’t noise. It’s a story about stability under stress. Collapse Index Synthetic Datasets (CISD) are curated synthetic datasets that reveal collapse under small, benign changes. Each one ships with a standardized schema, a sealed manifest, and SHA‑256 integrity checks, and is generated in‑house with the Collapse Index Data Generator (CIDG), fully offline.




🧪 CISD: Synthetic Datasets

> Stress-designed, audit-aligned.

Collapse Index Synthetic Datasets (CISD) are curated to expose instability under benign variation. They follow CI-compatible schemas and include sealed summaries for reproducibility and audit.


[Available in sizes from small samples to enterprise-scale sets.]


  • Schema

    Consistent, standardized columns.

    Every dataset carries [id], [variant], [label], [prediction], and [confidence] columns (plus optional fields), standardized for analysis and audit.

  • In‑house Generation

    Made with CIDG, offline.

    All datasets are generated and processed fully in‑house with the Collapse Index Data Generator (CIDG), with no cloud calls.

  • Formats

    CSV and Parquet.

    Checksums provided for integrity; sealed summaries included for provenance.

  • Privacy

    Redacted or synthetic.

    No external dependencies; generated offline with bias-controlled distributions.
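
To make the schema concrete, here is a minimal sketch of how a CISD-style CSV could be checked and scanned for prediction flips between variants of the same base row. The five column names come from the Schema item above. Everything else is an assumption made for illustration: the helper names `load_rows` and `flipped_ids`, the variant values `base`, `v1`, `v2`, and the sample rows. The flip check is a simple stand-in and is not the Collapse Index metric itself.

```python
import csv
import io
from collections import defaultdict

# Core columns named on this page; real packs may carry optional extras.
REQUIRED_COLUMNS = ["id", "variant", "label", "prediction", "confidence"]


def load_rows(handle):
    """Read CSV rows and fail early if a core column is missing."""
    reader = csv.DictReader(handle)
    present = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def flipped_ids(rows):
    """Return ids whose prediction differs across variants (hypothetical check)."""
    predictions = defaultdict(set)
    for row in rows:
        predictions[row["id"]].add(row["prediction"])
    return sorted(i for i, preds in predictions.items() if len(preds) > 1)


# Made-up sample for illustration only; not taken from any CISD pack.
SAMPLE = """id,variant,label,prediction,confidence
a1,base,pos,pos,0.97
a1,v1,pos,pos,0.91
a1,v2,pos,neg,0.55
b2,base,neg,neg,0.88
b2,v1,neg,neg,0.86
b2,v2,neg,neg,0.90
"""

rows = load_rows(io.StringIO(SAMPLE))
print(flipped_ids(rows))  # ['a1']
```

In this sample, `a1` is reported because one of its variants changes the prediction while `b2` stays stable. For a real pack, the same functions could be given an open `dataset.csv` file handle instead of the in-memory sample.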




🔑 Key Features

> What makes CISD different.


  • 100% Synthetic, Offline

    No scraping. No real‑world data.

    Generated with deterministic scripts and controlled randomization. No API calls, pre‑trained models, or inherited corpora.

  • Legally Clean

    Publication‑safe and auditable.

    Bias‑controlled distributions, reproducibility manifests, and SHA‑256 provenance for academic‑grade verification.

  • Stress Tiers

    CP1 → CP6.

    Six standardized, custom-tuned scenario tiers share a consistent schema and labeling, so collapse and brittleness can be benchmarked across severities.

  • Sealed Bundles

    Ready to use.

    Each pack ships as a sealed zip with dataset.csv, README.md, MANIFEST.json, LICENSE.txt, embedded metadata, and a certificate of authenticity.

  • Pack Defaults

    Six datasets, each with 1,000 base rows × 3 fixed variants.

    CSV + manifest by default; larger or custom configurations available on request.

  • Integrity

    SHA‑256 verified.

    Offline verification commands included; generation and checks run fully offline.
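
As an illustration of the integrity check, here is a minimal sketch that hashes a file with SHA‑256 and compares it to an expected digest, using only the Python standard library so it runs offline. It is not the verification command shipped with the packs. The function names `sha256_of_file` and `verify` are hypothetical, and the expected digest is passed in directly because the layout of MANIFEST.json is not described on this page.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True when the file's SHA-256 matches the expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demo on a throwaway file; a real check would use the digest supplied with the pack.
content = b"id,variant,label,prediction,confidence\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.csv")
    with open(path, "wb") as handle:
        handle.write(content)
    ok = verify(path, hashlib.sha256(content).hexdigest())
    tampered = verify(path, "0" * 64)

print(ok, tampered)  # True False
```

A mismatch means the file differs from the one that was sealed, whether through corruption in transit or modification after download.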




🧷 Technical Specs

> What you receive.


  • Data: Six curated datasets

  • Docs: Manifest + README

  • Integrity: SHA‑256

  • Support: Email support




🚀 Try, Buy, or Scale

> How to get datasets.


  • Demo

    Preview on Hugging Face.

    Explore a sample dataset in your browser. Zero install. Coming soon.


    Open demo (coming soon) →
  • Purchase

    Buy curated packs.

    Secure storefront with instant download and integrity manifests.


    Buy on Gumroad (coming soon) →
  • Enterprise

    Up to 2.5M rows per set.

    Custom schemas, tailored perturbations, access to CI-tuned diagnostic and semantic packs, and private support.


    Contact →