Hugging Face
The host behind a Hugging Face Hub share. Hugging Face holds your repo — model weights, dataset files, or a Space — plus the usual account details (name, email, billing, usage). The knob that decides everything here is set when you create the repo: public repos are visible to everyone and carry an open license to every other user; private repos are invite-only and confidential; gated repos are public-to-find but make each requester hand you their name and email before they can download. So exposure is a creation-time decision, not a buried setting.
Last verified: 2026-06-07 · Confidence: high on the public/private/gated mechanics, the US-default + EU-region-on-Team/Enterprise split, GDPR/SOC2/DPA (all from Hugging Face's own docs and ToS); the training-on-your-content question is the soft spot — Hugging Face's privacy policy and ToS are silent on whether your uploads feed model training, so the "they don't" read is reasoned from the absence of any training license, not quoted.
What it holds, and who can see it
Hugging Face is a Git host for ML, so a repo can be code, multi-gigabyte model weights, a dataset, or a running Space. One choice at creation sets who reaches it: [confirmed]
- Public: anyone can find, view, and copy it. Setting a repo public grants every other user "a perpetual, irrevocable, worldwide, royalty-free, non-exclusive license to use, display, publish, reproduce, distribute, and make derivative works of your Content." It's the share-by-default state and it's genuinely open. [confirmed]
- Private: invite-only and confidential. A private repo doesn't show in search, returns `404 - Repo not found` to outsiders, and can't be cloned by them; Hugging Face commits to "reasonable and appropriate measures designed to keep your Content confidential." [confirmed]
- Gated: public to find, but you screen each downloader. A gated repo is reachable but locked: users "must agree to share their contact information (username and email address)" before downloading, and you can switch to manual approval to accept or reject each request (plus collect extra fields like company or country). [confirmed]
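The three states above can be read back off a repo's metadata. What follows is a minimal sketch, not Hugging Face's own tooling: `exposure` is a hypothetical helper, and it assumes the info object exposes `private` (a bool) and `gated` ("auto", "manual", or False), which is how the objects returned by `HfApi().model_info()` in recent `huggingface_hub` releases appear to behave. Stand-in objects are used so it runs without network access.

```python
from types import SimpleNamespace


def exposure(info) -> str:
    """Classify a repo as 'private', 'gated', or 'public'.

    Assumption: `info` has a boolean `private` attribute and a `gated`
    attribute that is "auto", "manual", or False/None.
    """
    if getattr(info, "private", False):
        return "private"
    if getattr(info, "gated", False):  # "auto" and "manual" are both truthy
        return "gated"
    return "public"


# Stand-in objects (hypothetical repos) in place of real Hub responses.
examples = {
    "open-weights": SimpleNamespace(private=False, gated=False),
    "screened-weights": SimpleNamespace(private=False, gated="manual"),
    "internal-draft": SimpleNamespace(private=True, gated=False),
}

report = {name: exposure(info) for name, info in examples.items()}
print(report)
```

Against the real Hub, the same function could be fed the result of `HfApi().model_info("your-name/your-repo")`, assuming the installed library version provides those attributes.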
You retain ownership throughout — "You own the Content you create!" — and you grant Hugging Face only the license it needs "to provide Services." [confirmed]
One honest line: visibility is per-repo. If something carries data you wouldn't hand a stranger, make the repo private before you push the files, not after.
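One way to enforce "private before you push" is to make the upload conditional on a visibility check. The sketch below is illustrative only: `push_privately` is a hypothetical helper, and `api` is assumed to be a `huggingface_hub.HfApi` instance (or anything exposing the same `create_repo`, `repo_info`, and `upload_folder` methods with these keyword arguments).

```python
def push_privately(api, repo_id: str, folder: str, repo_type: str = "model") -> None:
    """Create a private repo, confirm it is private, then upload files.

    Assumption: `api` behaves like huggingface_hub.HfApi.
    """
    # exist_ok is left at its default so an already-existing repo
    # (which might be public) raises instead of being silently reused.
    api.create_repo(repo_id=repo_id, private=True, repo_type=repo_type)

    # Read the setting back rather than trusting the create call.
    info = api.repo_info(repo_id=repo_id, repo_type=repo_type)
    if not info.private:
        raise RuntimeError(f"{repo_id} is not private; refusing to upload")

    # Only now do any files leave the local machine.
    api.upload_folder(repo_id=repo_id, folder_path=folder, repo_type=repo_type)
```

A call might look like `push_privately(HfApi(), "your-name/your-repo", "./weights")`, with the repo name and folder being placeholders. The point of the design is ordering: the visibility check sits between repo creation and the first upload.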
Does it train AI on your content?
This is the one to read carefully, because the docs are quieter than at some hosts: [unclear]
- No published clause says they train on your repos, and there is no toggle to switch off. Unlike GitHub's Copilot setting or Replit's public-App training clause, Hugging Face's privacy policy and Terms simply don't address using your uploaded models or datasets to train models. The only related language is that they "may aggregate, anonymize, or otherwise learn from data relating to your use of the Services", i.e. usage telemetry, not your file contents. The fair read is that they don't train on your content, but it's an absence of a claim, not a promise. [unclear]
- Public is still open to everyone else. Even if Hugging Face itself doesn't train on it, a public repo is openly licensed (above), so anyone, including other AI labs, can legally download and train on it. Going private or gated is the only off-switch for that. [confirmed]
- An open dataset already on the Hub is fair game for training by design; that's what the platform is for. The privacy question is only about your private uploads. [estimate]
If the training question is load-bearing for you (a grant or partner forbids it), the clean answer is the Enterprise DPA below, which puts the no-training commitment in writing rather than leaving it to silence.
Keeping and deleting your data
- While your account is live, they hold it. That's the point of a host, and retention runs "as long as necessary to deliver the Services" plus legal/security needs. No fixed window is published for live content. [confirmed]
- Delete a repo or your account from your profile. "You may decide to cancel your account and your content at any time by editing your profile." The docs don't promise a recovery window, and they don't state when deleted data is actually purged, so assume deletion is final and keep your own copy first. [unclear]
- Anything already downloaded is out of your reach. A public, openly-licensed repo that someone has cloned or mirrored survives your deletion: the open license is "irrevocable," so you can't claw copies back. [confirmed]
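"Keep your own copy first" can be made mechanical by refusing to delete until a local snapshot exists. This is a rough sketch under stated assumptions: `backup_then_delete` is a hypothetical helper, and `api` is assumed to behave like `huggingface_hub.HfApi`, whose `snapshot_download` returns the local folder path and whose `delete_repo` removes the repo.

```python
from pathlib import Path


def backup_then_delete(api, repo_id: str, backup_dir: str, repo_type: str = "model") -> Path:
    """Download a full snapshot, confirm files arrived, then delete the repo.

    Assumption: `api` behaves like huggingface_hub.HfApi.
    """
    local = Path(
        api.snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=backup_dir)
    )

    # Count real files only; skip any download-cache metadata folder.
    files = [p for p in local.rglob("*") if p.is_file() and ".cache" not in p.parts]
    if not files:
        raise RuntimeError(f"backup of {repo_id} is empty; not deleting")

    # Treated as irreversible, since no recovery window is documented.
    api.delete_repo(repo_id=repo_id, repo_type=repo_type)
    return local
```

The file count is a weak check: it shows that something was downloaded, not that the copy is complete. Comparing against the repo's file listing would be a stronger test.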
What a Team / Enterprise plan changes
For an individual sharing a model card or a public dataset, the free tier is genuinely fine — the choice that matters is public-vs-private-vs-gated. The paid tiers add the paperwork and controls a compliance review asks for: [confirmed]
- A real data agreement. Hugging Face "can also offer Business Associate Addendums or GDPR data processing agreements through an Enterprise Plan"; the DPA is what carries an explicit no-training commitment and processor terms. [confirmed]
- Admin controls: SSO, audit logs, Resource Groups (fine-grained access control), org-wide token policies (require approval / fine-grained-only), and protected Spaces. [confirmed]
- Region pinning (next section) and SOC 2 Type 2 assurance covering the whole platform. [confirmed]
Where your data lives (matters under GDPR)
- US by default. "For non-Team or Enterprise users, repositories are always stored in the US." Under GDPR that's a US transfer, which is usually fine, but name it if a grant or DPA restricts where data may sit. [confirmed]
- EU storage is a Team/Enterprise feature. Those plans get a Regions settings page to pin "models, datasets and Spaces" to US or EU (Asia-Pacific "coming soon"); non-default repos even show a region tag. [confirmed]
- The company is grounded in both. Hugging Face is US-based ("The Company and its servers are located in the United States") but names "Hugging Face, SAS" in France as its EU establishment under CNIL, so a DPA with EU Standard Contractual Clauses is the transfer mechanism on the Enterprise path. [confirmed]
- Niche but handy: a gated repo can block EU downloaders specifically with `extra_gated_eu_disallowed: true` if a model's license forbids EU distribution. [confirmed]
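Gating options like the EU flag are set in the metadata block at the top of the repo's README. The sketch below renders such a block. Only `extra_gated_eu_disallowed` comes from the text above; the `extra_gated_prompt` and `extra_gated_fields` keys and the `text` / `country` field types are recalled from the gated-models documentation and should be checked against it before use. `gated_front_matter` itself is a hypothetical helper.

```python
import json


def gated_front_matter(prompt: str, fields: dict[str, str], eu_disallowed: bool = False) -> str:
    """Render a YAML front-matter block for a gated repo's README.

    Assumption: field labels are simple words that need no YAML quoting.
    """
    lines = ["---"]
    # json.dumps yields a double-quoted string, which YAML also accepts.
    lines.append(f"extra_gated_prompt: {json.dumps(prompt)}")
    if fields:
        lines.append("extra_gated_fields:")
        for label, kind in fields.items():
            lines.append(f"  {label}: {kind}")
    if eu_disallowed:
        lines.append("extra_gated_eu_disallowed: true")
    lines.append("---")
    return "\n".join(lines) + "\n"


block = gated_front_matter(
    "Please share your contact details to access this model.",
    {"Company": "text", "Country": "country"},
    eu_disallowed=True,
)
print(block)
```

The output would be pasted above the README body. Turning gating on and choosing automatic or manual approval are separate repo settings and are not covered by this block.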
The short version: fine for a public model, an open dataset, or an internal private repo, even with EU/UK collaborators in the everyday case. If a funder or DPA forbids personal data leaving the EU, that's a Team/Enterprise conversation — there's no EU region on the free tier, and no UK-specific region as of 2026-06.
Sources
- Hugging Face Terms of Service: you own your Content; public repos grant an irrevocable open license to all users; private repos kept confidential; "aggregate, anonymize, or otherwise learn from data relating to your use."
- Privacy Policy: account data collected, retention "as long as necessary," cancel account by editing profile, US servers, France/CNIL EU establishment.
- Repository settings: visibility chosen at creation; private = not searchable / `404` / not cloneable; protected Spaces on PRO/Team/Enterprise.
- Gated models: access requests share username + email, automatic vs manual approval, extra fields, EU-disallow flag, download access report.
- Security: GDPR compliant, SOC 2 Type 2, BAA/DPA via Enterprise, private repos, MFA, malware scanning.
- User access tokens: read/write/fine-grained roles, one token per app, org token-approval policies on Team/Enterprise.
- Storage Regions on the Hub: US default, EU region (APAC coming) on Team & Enterprise only, region tags.
- Hugging Face Data Processing Addendum: DPA / SCC mechanism for Enterprise. Cited for existence only: the source is a CDN-hosted PDF whose text did not parse cleanly (unofficial parse, seen 2026-06-07).