Questions & Answers

What you’re probably wondering.

Ask Us Directly

We get a consistent set of hard questions — from investors, journalists, artists, enterprises, and researchers. Here are the honest answers.

Content provenance is the complete record of a piece of content's origin, creation history, and chain of custody — who made it, when, with what tools, and how it has been modified or transmitted since.

  • Where did this come from? (Detection direction — tracing unknown content backward to its source)
  • Is this still what it was? (Protection direction — sealing authentic content forward against misuse)

Standards like C2PA are building the technical foundation for provenance at scale. We are built to operate within and extend that ecosystem.

Two directions of the same problem:

  • Detection (Reveal) extracts backward provenance: where does this content come from? Is it synthetic or authentic, and which model, prompt, source, or camera produced it?
  • Protection (Seal) secures forward provenance: how do you guard this content wherever it goes? This covers your art, your face, and your voice, with both the content and its provenance remaining untouchable.

Together they close the loop: you can trace what you don't trust, and protect what you do.

FakeCatcher is one of the first deepfake detectors that uncover what is real in humans, like their heartbeats, using photoplethysmography (PPG) — the biological signal of blood flow captured by subtle color changes in skin pixels on video. Later, it powered the world's first real-time deepfake detection system.

It inspects what is fundamentally absent in synthetic faces: the coherent heartbeat, consistent spatially, temporally, and spectrally, that permeates every authentic human face on video.
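To make the PPG idea concrete, here is a minimal sketch of how a pulse rate could be read from the average green value of a skin region across video frames. It is an illustration only, not FakeCatcher's implementation: the function name, the 0.7 to 4 Hz search band, and the synthetic input are all assumptions made for this example.

```python
import math


def dominant_heart_rate_bpm(green_means, fps, lo_hz=0.7, hi_hz=4.0, step_hz=0.05):
    """Estimate the dominant pulse rate (beats per minute) from per-frame
    mean green values of a skin region. Hypothetical helper for illustration."""
    n = len(green_means)
    mean = sum(green_means) / n
    centered = [g - mean for g in green_means]  # remove the constant skin tone

    best_hz, best_power = lo_hz, -1.0
    steps = int(round((hi_hz - lo_hz) / step_hz))
    for k in range(steps + 1):
        hz = lo_hz + k * step_hz
        # Correlate the signal with a sinusoid at this frequency (one DFT bin).
        re = sum(c * math.cos(2 * math.pi * hz * i / fps) for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * hz * i / fps) for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_hz, best_power = hz, power
    return best_hz * 60.0


# Synthetic input: a 1.2 Hz (72 bpm) pulse riding on a constant skin tone,
# sampled for 10 seconds at 30 frames per second.
fps = 30
frames = [120.0 + 0.5 * math.sin(2 * math.pi * 1.2 * i / fps) for i in range(300)]
bpm = dominant_heart_rate_bpm(frames, fps)
print(round(bpm))  # 72
```

A real detector would go much further, for example by comparing such signals across several face regions and over time, which is where the spatial, temporal, and spectral consistency described above comes in.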

FakeCatcher is the foundational detection technology — the PPG-based biological deepfake detector, which powered several platforms including the world's first real-time deepfake detection system. It was built by Dr. Ilke Demir and her colleagues.

Cauth AI offers the full-provenance platform, built on top of several foundational technologies for both detection and protection. We love FakeCatcher, but the synthetic world is too complex to depend on only one detector.

All publications are accessible and frequently updated on our research page, indexed across deepfake detection, adversarial protection, multi-modal verification, and provenance attribution.

View the full research index →

The complete platform is available through our UI, and its two main entry points are available through our API.

For pilot access, documentation, or a technical walkthrough with our engineering team, get in touch.

Our platform covers the full synthetic content surface as exemplified below:

  • Human-centric video, image, and audio: Deepfakes, face reenactment, GAN/diffusion-generated portraits, voice cloning, singing voice conversion, text-to-X outputs.
  • AI generated and manipulated general content: Synthetic ads, scene manipulation, copyrighted content editing, satellite imagery deepfakes, fake object placements.
  • Hybrid content: For temporal data (video, audio), partial manipulations and generations.
  • Complex data: Multi-person videos, multi-speaker audio, low resolution, high compression.

Our current platform is focused on visual and audio modalities.

Text detection is a different problem: the field is dominated by statistical approaches that are brittle against paraphrasing, fine-tuning, and model updates. We have chosen depth over breadth — our detection methods are peer-reviewed and adversarially robust in the modalities we cover — rather than offering a generic text detector with the limitations it carries.

If text detection is critical to your use case, we are happy to discuss what hybrid approaches might look like in the context of a broader deployment. Get in touch.

We have layered defenses that make targeted attacks extremely difficult in practice.

  • Pipeline guardrails: Input purification, model blending, and rate-limiting prevent bad actors from training adversarial examples against our APIs in a black-box manner.
  • Immunity by design: To fool our detectors, a generator would need our signal extractors to be differentiable (they are not), would need to mimic both the spatial and temporal consistency of real signals, and would need large ground-truth biological datasets to even approximate them.
  • Grounded reasoning layer: Even if a detector is individually tricked, our results are reported alongside human-interpretable evidence. Nonsensical evidence chains are visible — a user would simply reject the output, not act on it.

We aim to be "rightfully incorrect" if our results fail: transparent enough that any misuse of our errors is clearly identifiable.

Our published benchmarks are independently peer-reviewed, always updated against new threats, and continuously battle-tested. Key results from our papers and studies at the time of publishing:

97.29%, 95.12%, 99.27%, 99.50%

Deepfake detection using PPG signals, motion representations, eye/gaze features, and on satellite imagery.

93.39%, 97.77%

Source detection using PPG signals and motion representations.

+15.7%, +20.9%

Increase in human detection accuracy when paired with a detector companion, in general and high confidence predictions.

Human-based detectors require visible skin, muscles, or eyes: the signal lives in pixels, and those pixels are needed to build reliable spatiotemporal maps.

However, running several interpretable, complementary detectors makes hard cases self-explanatory. Where multiple signals converge, confidence is high. Where they diverge or are absent, the user knows why, because we surface uncertainty, limitations, and evidence.
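The converge-or-diverge logic can be sketched in a few lines. This is a hypothetical illustration, not our production fusion logic: the detector names, the 0.2 agreement margin, and the 0.5 decision threshold are assumptions chosen for the example.

```python
def summarize(scores, agree_margin=0.2):
    """Combine per-detector scores into a verdict plus the reason for it.

    scores maps a detector name to a probability that the content is
    synthetic (0.0 to 1.0), or None when that detector's signal is absent.
    All names and thresholds here are hypothetical.
    """
    available = {name: s for name, s in scores.items() if s is not None}
    missing = sorted(name for name, s in scores.items() if s is None)

    if not available:
        return {"verdict": "inconclusive", "reason": "no usable signal", "missing": missing}

    spread = max(available.values()) - min(available.values())
    if spread > agree_margin:
        # Detectors diverge: report that openly instead of averaging it away.
        return {"verdict": "inconclusive", "reason": "detectors disagree",
                "missing": missing, "scores": available}

    mean = sum(available.values()) / len(available)
    verdict = "likely synthetic" if mean >= 0.5 else "likely authentic"
    return {"verdict": verdict, "reason": "detectors agree",
            "missing": missing, "scores": available}


converging = summarize({"ppg": 0.90, "motion": 0.85, "gaze": 0.95})
diverging = summarize({"ppg": 0.90, "motion": 0.20, "gaze": None})
```

In the second call the gaze signal is absent and the remaining two detectors disagree, so the result is reported as inconclusive together with the reason and the list of missing signals.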

Benchmarks are a floor, not a ceiling. The more meaningful test is field deployment under unknown conditions — and we have that record.

Cauth AI has been deployed continuously as AI-generated content overwhelmed the information environment at scale: fake missile strikes, fabricated photojournalism, weaponized forensic heatmaps, fake politicians, manipulated phone conversations, ...

These are not controlled evaluations. They are deadline-pressure deployments where a wrong call has consequences. The stories are public and linked from our news section.

Not the way we play it. The "cat-and-mouse" framing applies to detectors that chase artifacts — frequency anomalies, boundary glitches, pixel-level tells. Generators do fix those over time. That's a real problem for artifact-based approaches.

Our detectors are built on biological, natural, and physical signals — heartbeat patterns, micro-motion physics, material consistency — that generators are not optimizing for. Generators are pursuing beauty, smoothness, and symmetry. Paradoxically, as generators become more perfectionistic, they move further from the messy realism of natural signals.

Lastly, detection platforms are orders of magnitude fewer than generation platforms. We are behind on public adoption — but that is a deployment problem, not a technical one.

Our platform covers the full threat surface across which content you own can be stolen:

  • Digital art, photography, illustrations, visual content.
  • Speech, vocal tracks in songs, voice-overs, audio content.
  • Faces and bodies, in social media, content platforms, surveillance footage, training content, ads, and more.

According to our published metrics (RMSE, PSNR, SSIM, STOI, PESQ, SI-SDR, MOS), quantitatively no. According to our artist study (see the paper), aesthetically no.

Our adversarial generators operate below the threshold of human perception. They are not visible watermarks, quality degradations, or compression artifacts.

Protected content can be published, shared, and distributed exactly as the unprotected original would be. The protection travels with the file.

We ran a formal study with 102 professional artists from diverse backgrounds and disciplines — not a convenience sample, not a self-selected group.

  • 98% saw no significant difference between original and protected versions in the high-fidelity setting.
  • 96% explicitly said they need our complete controllable shield to protect their content.

That's not a feature request. That's a signal of unmet demand. The creative community has been largely unserved by tools that are either too slow, too narrow, or too visually destructive.

Existing tools are designed for single-purpose text-to-image tasks — protecting a specific style, category, or text against a specific model. Effective in that narrow lane, but brittle outside it.

Our protectors are designed to degrade output quality across any diffusion-based task for both training-time and inference-time attacks. Key advantages:

  • Artist control: grants direct control over the fidelity/protection tradeoff. Same protection, your rules.
  • Task-agnostic: works across all AI misuse scenarios (style transfer, inpainting, editing, deepfakes)
  • Model-agnostic: attacks any diffusion pipeline, not just text-to-image
  • Ultra fast: instead of hours-long iterative optimization per image, our approach is a single feed-forward pass, roughly 600× faster. Artists can re-seal in seconds as new models or purifiers emerge.
  • Validation: matched Glaze's protection while showing 4% higher willingness in the high-fidelity setting

We tested robustness against a current state-of-the-art adversarial purifier. The results show a two-way failure for the attacker:

  • Distortion: To have any effect, the purifier must degrade the original image: PSNR drops from 29.5 dB to 21 dB. The "cleaned" image is visibly damaged.
  • Incomplete: Even after that damage, the generation output only improves by 2 dB, which is still far below usability thresholds.
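For readers less familiar with PSNR, the sketch below shows the standard definition and what a drop from 29.5 dB to 21 dB means in terms of pixel error. The helper names are hypothetical and the code is a worked example of the arithmetic, not part of our evaluation pipeline.

```python
import math


def psnr_db(original, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(distorted):
        raise ValueError("inputs must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_psnr_drop(before_db, after_db):
    """How many times larger the mean squared error becomes when PSNR drops."""
    return 10.0 ** ((before_db - after_db) / 10.0)


# An 8.5 dB drop (29.5 dB to 21 dB) means roughly 7 times more squared error.
ratio = mse_ratio_for_psnr_drop(29.5, 21.0)
print(round(ratio, 2))  # 7.08
```

Because PSNR is logarithmic, a drop of a few dB is a large change: every 10 dB lost corresponds to ten times more mean squared error.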

While stronger purification methods will inevitably emerge, our speed advantage means artists simply re-seal their work in seconds whenever a new purifier or model version releases.

It is a real concern, and "security by obscurity" is not the answer. Three ways we think about it:

  • Technical risk is manageable. Input purification, model blending, and rate-limiting prevent systematic adversarial training against our APIs. We cannot eliminate this risk entirely, but we can reduce it to a level comparable to any other security-critical API.
  • Social risk is the harder problem. A false positive could let a politician deny a real harmful speech. A false negative could let an influencer fake an endorsement. The result from our platform makes us accountable — so we design our outputs to be auditable and human-interpretable, not just a score.
  • The asymmetry is the argument. Generation platforms are releasing the newest models without meaningful safety afterthought. There are orders of magnitude more generation APIs than detection APIs. If this is an arms race, detection should be available and continually funded — otherwise we concede the field.

Our ingestion pipeline includes several defense layers that collectively prevent black-box adversarial training:

  • Input purification: Submitted content is preprocessed to remove or neutralize adversarial attacks before inference.
  • Model blending: No single model architecture is exposed — outputs are derived from blended ensembles, making gradient-based attacks ineffective.
  • Rate-limiting & behavioral analysis: Systematic probing patterns are detected and blocked before a meaningful training signal can be extracted.
  • Structural immunity: When signal extraction is non-differentiable, no gradient can pass through it. An adversary literally cannot include it in a loss function.
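As an illustration of the rate-limiting layer only, here is a minimal sliding-window limiter. It is a generic sketch, not our ingestion pipeline: the class name, the limits, and the client identifier are hypothetical, and behavioral analysis of probing patterns is out of scope for this example.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id, now):
        timestamps = self._history[client_id]
        # Forget requests that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False  # too many recent queries: reject this one
        timestamps.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
decisions = [limiter.allow("client-a", t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
print(decisions)  # [True, True, True, False, True]
```

Capping how many queries a client can make limits how many labeled input/output pairs an attacker can collect, which is the raw material for black-box adversarial training.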

Our algorithms, evaluations, and benchmarks are continuously published in top AI, CV, and ML venues.

Under the Daubert standard (the framework US federal courts use to evaluate the admissibility of scientific expert testimony), evidence must be grounded in methods that are testable, peer-reviewed and published, associated with known error rates, and generally accepted within the relevant scientific community. Our core methods satisfy all four criteria. No commercial deepfake detector we are aware of can point to a foundational method published at this level.

In practice, this means that when our results are submitted as evidence — with timestamped outputs, signal-level breakdowns, and documented confidence scores — they are defensible under cross-examination in a way that black-box ensemble detectors are not. For high-stakes forensic cases, contact us directly.

False positives are the most serious failure mode in detection. A wrongful flag of synthetic content can harm reputations, fuel misinformation, and erode trust in detection technology as a whole.

Our approach is to make every result explainable and grounded rather than issuing a binary "fake/real" verdict. When we flag content, we surface the specific signals that drove the decision — timestamped, spatially localized, and human-readable. If a result cannot be backed by coherent evidence, we do not report it.
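One way to picture an evidence-gated result is as a record that cannot exist without supporting evidence. The sketch below is hypothetical: the field names and the `build_report` helper are assumptions for illustration and do not describe our actual API schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Evidence:
    signal: str                        # which detector signal, e.g. "ppg"
    start_s: float                     # start of the flagged time span, in seconds
    end_s: float                       # end of the flagged time span, in seconds
    region: Tuple[int, int, int, int]  # x, y, width, height in pixels
    note: str                          # human-readable explanation


@dataclass
class Report:
    verdict: str
    evidence: List[Evidence]


def build_report(verdict: str, evidence: List[Evidence]) -> Optional[Report]:
    """Return a report only when usable evidence backs the verdict."""
    usable = [e for e in evidence if e.end_s > e.start_s and e.note.strip()]
    if not usable:
        return None  # nothing coherent to show, so nothing is reported
    return Report(verdict=verdict, evidence=usable)


unsupported = build_report("synthetic", [])
supported = build_report("synthetic", [
    Evidence("ppg", 2.0, 4.5, (120, 80, 64, 64), "no coherent pulse in the cheek region"),
])
```

Here `unsupported` is `None`, while `supported` carries the timestamped, localized explanation that a reviewer can inspect and challenge.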

This doesn't eliminate false positives — no system can — but it means that when we are wrong, the error is visibly and auditably wrong, not a black-box decision that cannot be challenged.

We provide a genuine improvement to how content authenticity works — and we are intentional about not building an addictive product that creates dependency without underlying value. We prefer partners and investors aligned with that higher motivation rather than those seeking to hyperscale without regard for impact.

That said: vitamins can be profitable, especially when regulation mandates taking them. The EU AI Act, emerging US state-level deepfake legislation, and global policy provisions are all moving toward mandating "taking vitamins".

There is also a third path emerging: Low-quality, mass-generated content is triggering a market backlash. Tools that can prove authenticity, or even clean provenance chains, are beginning to look like they could become the good kind of morphine: something users reach for every time they encounter content they need to trust.

The field of provenance is structurally underfunded relative to generation. There are orders of magnitude more public generation platforms than detection platforms — and generation platforms are releasing new model versions continuously with minimal safety investment.

Most competitors offer one of: (1) a single-modality detector with no explainability, (2) a watermarking approach that only works on content the same organization generated, or (3) a forensics or cloaking tool built for researchers, not enterprise workflows.

Cauth AI is the only platform combining detection and provenance in a single stack — backed by peer-review and a founding team that invented key technologies in the field.

Our platform is designed for any organization where content trust is operationally critical. Media, entertainment, news, journalism, banks, creative industry, studios, platforms, online communities, dating, gaming, intelligence, government, defense, finance, insurance, marketplace, legal, policy, compliance... If there is online content, there is need.

View the full solution index →