Hugging Face Hub

The place the ML and research world looks for models and datasets — versioned, documented, and pulled into someone's code with one line. Put your model's weights or your dataset's files at a huggingface.co/... address, and the exact audience that wants them finds, cites, and builds on them. Choose:

Public — anyone finds and downloads it, no account to read.

Private — only people you add.

Gated — listed for all to find, but each person requests access and you let them in.

Reach for it when you're handing over a model or data others load straight into ML code. Skip it when it's ordinary project files people read and edit — a GitHub repo fits that; the Hub is built for big weight and data files plus the ML tooling around them.

Last verified: 2026-06-07 · Confidence: high on the public/private/gated model, the one-line pull, and the card.

It allows you to

Put it where the field already looks. The ML and research audience finds models and datasets by search and tag — not a link you push to them.
Let them pull it with one line. Anyone you allow loads the whole thing into their own code — snapshot_download("you/your-repo") — no ZIP, no "which file goes where".
Document it on a card. A README renders as the front page: what it is, how to load it, the licence (with a badge), how to cite.
Ship big files without fuss. Multi-gigabyte weights and data shards upload on the Hub's large-file backend — no extra setup. [confirmed]
Screen each downloader when it's sensitive. Set the repo gated and every requester shares their username and email first. Details: Who can get in.
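A minimal sketch of the card item above: on the Hub, the card is a README.md whose leading YAML block carries metadata such as the licence and tags, followed by ordinary Markdown. The helper below builds that text in plain Python. The function name build_card, the licence id, the tags, and the section wording are all placeholders chosen for illustration, not values from this guide.

```python
def build_card(repo_id, license_id, tags, summary, citation):
    """Return README.md text: a YAML metadata block, then the readable card."""
    lines = ["---", f"license: {license_id}", "tags:"]
    lines += [f"- {tag}" for tag in tags]
    lines += [
        "---",
        "",
        f"# {repo_id}",
        "",
        summary,
        "",
        "## How to load",
        "",
        # Indented four spaces so Markdown renders it as a code block.
        "    from huggingface_hub import snapshot_download",
        f'    local_dir = snapshot_download("{repo_id}")',
        "",
        "## How to cite",
        "",
        citation,
        "",
    ]
    return "\n".join(lines)


# Placeholder values; replace every one of them with your own.
card = build_card(
    repo_id="you/your-repo",
    license_id="apache-2.0",
    tags=["text-classification", "safety"],
    summary="What it is for, its limits, and what not to use it on.",
    citation="Placeholder citation text.",
)
print(card)
```

Saving the returned text as README.md at the top of the repo folder is what makes it the front page. The summary paragraph is the part only you can write.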

Ideal for

A fine-tuned safety classifier others evaluate — you release the weights gated, each lab requests access, and they pull it into their own eval harness. Like Meta's Llama Guard 3 — a content-safety classifier, request-to-access, 50k+ downloads a month.
A curated eval or benchmark dataset — rows others load with one line to score their own model, every version pinned so a citation points at exactly what you ran.
A forecasting or research dataset with a citable card — the card carries the licence, the source, and how to cite, so a paper can point at your repo id and reproduce from it.

Who can get in

You pick the audience at create time. Public, private, or gated — flip between them later in the repo's settings. [confirmed]
Gated is the standout. The repo stays findable, but downloads lock behind a request — each person clicks "agree", shares their username and email, and you auto-grant or approve by hand. Best for early research weights or a dual-use model you release deliberately. [confirmed]
- Gating hides the files, not the page — name, card, and metadata stay public. If even the existence is sensitive, use private instead. [confirmed]
Cut someone off. Revoke a granted user any time; a copy they already downloaded stays with them (true everywhere). [confirmed]
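The three audiences above can be sketched as repo settings. This assumes a recent huggingface_hub release in which HfApi.update_repo_settings accepts both private and gated; check the installed version before relying on it. The names AUDIENCE_SETTINGS and set_audience are hypothetical helpers for this sketch, and the api parameter exists only so the function can be exercised without touching the network.

```python
# Hypothetical mapping from the audiences named above to repo settings.
AUDIENCE_SETTINGS = {
    "public": {"private": False, "gated": False},
    "private": {"private": True, "gated": False},
    # "manual" means you approve each request; "auto" grants on request.
    "gated": {"private": False, "gated": "manual"},
}


def set_audience(repo_id, audience, *, repo_type="model", api=None):
    """Apply one of the three audiences to an existing repo."""
    settings = AUDIENCE_SETTINGS[audience]  # KeyError on an unknown audience
    if api is None:
        # Imported lazily so the sketch loads even without the library.
        from huggingface_hub import HfApi

        api = HfApi()  # uses the token saved by `hf auth login`
    api.update_repo_settings(repo_id, repo_type=repo_type, **settings)
    return settings
```

Calling set_audience("you/your-repo", "gated") would switch a repo to request-to-access with manual approval. Swap "manual" for "auto" in the mapping if every request should be granted automatically.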

Which rungs it can hold. Just you / named people / the whole internet, plus gated (public-to-find, you approve each download) — no plain "anyone with the link" rung. → Who can see it? [confirmed]

Handing data to the host. Hugging Face holds your repo; a public one carries an open, irrevocable licence to every other user, and the docs are silent on whether Hugging Face trains on your uploads. → Can you trust the company? [unclear]

What you do to set it up

Ask: tell Claude Code "push this model/dataset to a Hugging Face repo and share it." It installs the library, creates the repo, drafts the card, and uploads — including the big files. Every share after: one sentence, ~0 effort.
The part you can't delegate: writing the card — what it's for, its limits, what not to use it on. Only you know that. ~15–30 min of writing. [estimate]
One-time, in order:
1. Set up Claude Code — the thing that does the rest, ~10 min once.
2. A free account at huggingface.co/join — email + password, ~3 min once. [confirmed]
3. A Write access token at huggingface.co/settings/tokens, so your agent can push as you, ~2 min once (a Read token can't push). [confirmed]
Full walkthrough, gating, and the by-hand steps: Share a model or dataset on the Hub.
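For the by-hand route, the push itself might look like the sketch below, which uses create_repo and upload_folder from the huggingface_hub library. It is an illustration under stated assumptions, not the walkthrough's exact steps: push_folder is a hypothetical helper name, the commit message is a placeholder, and defaulting to private=True is a cautious choice made here so that nothing goes public by accident.

```python
def push_folder(folder, repo_id, *, repo_type="model", private=True, api=None):
    """Create the repo if it does not exist, then upload everything in `folder`."""
    if api is None:
        # Imported lazily so the sketch loads even without the library.
        from huggingface_hub import HfApi

        # Picks up the Write token from `hf auth login` or the HF_TOKEN variable.
        api = HfApi()
    # exist_ok=True makes a repeat run reuse the repo instead of failing.
    api.create_repo(repo_id, repo_type=repo_type, private=private, exist_ok=True)
    return api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type=repo_type,
        commit_message="Upload files and card",  # placeholder message
    )
```

A call such as push_folder("./my-model", "you/your-repo") would upload the folder, README.md card included, as one commit. Pass repo_type="dataset" for data.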

What the other person does

Pull the whole repo: one line in their own code — snapshot_download("you/your-repo") (add repo_type="dataset" for data). Files land cached, ready to load. ~10 sec to write, the rest is download time. [confirmed]
Or just download a file from the repo page in the browser — no code, no account, for a public repo. ~5 sec.
For a private or gated repo: they sign in once with their own free token (hf auth login), and — if gated — must have been granted access first. [confirmed]
Pay: nothing for public repos; large private storage is paid. → the fine print. [unclear]
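The recipient's side can be sketched as follows, using snapshot_download from huggingface_hub with its repo_type and revision parameters. The wrapper name fetch_repo and its download parameter are hypothetical, added only so the function can be tried without network access; the repo id is the same placeholder used above.

```python
def fetch_repo(repo_id, *, repo_type="model", revision=None, download=None):
    """Download a whole repo into the local cache and return its folder path."""
    if download is None:
        # Imported lazily so the sketch loads even without the library.
        from huggingface_hub import snapshot_download

        download = snapshot_download
    # revision=None means the default branch; a commit hash or tag pins
    # the exact version, which is what a citation should point at.
    return download(repo_id, repo_type=repo_type, revision=revision)
```

For a dataset this would be fetch_repo("you/your-repo", repo_type="dataset"). For a private or gated repo the call succeeds only after the person has logged in with their own token and, if gated, been granted access.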

It's project files people read and edit, not weights or rows? → a GitHub repository hands over the whole thing with every version tracked — built for code, not large model files.
People should see the model work, not load it? → a Hugging Face Space hosts a live, clickable demo (made the same way) — a heavier lift, so use it only when people need to try it. A Google Colab notebook is a lighter way to let someone run a demo end to end.

Good to know

A public repo is openly licensed to everyone, and whether Hugging Face trains on uploads is unanswered. Switching the repo to private or gated is the only way to pull it back. [confirmed]
Repos sit in the US by default; EU storage is Team/Enterprise. Name it if a funder restricts where data may live. [confirmed]
Pricing / free-storage caps: re-check live at huggingface.co/pricing. [unclear]
The detail behind all three: Hugging Face Hub — the fine print.