Free Citation Extractor — DOI, arXiv, PMID, ISBN, URL


What is Citation Extractor?

Citation Extractor scans arbitrary text for the identifiers researchers actually use (DOIs of the form 10.xxxx/…, arXiv preprint IDs, PubMed IDs, and ISBNs, along with plain URLs) and returns a deduplicated, clickable list. Reach for it after pasting in a messy references section, an email full of links, or the text layer of a PDF, when you want every citable identifier on one screen.

How it works

Each identifier type has its own regular expression tuned to its canonical format. DOIs are matched by their 10.xxxx/… prefix. arXiv IDs are matched in the modern YYMM.NNNNN format, with an optional version suffix such as v2. PubMed IDs are matched only when preceded by the literal PMID: prefix, because a bare number is indistinguishable from any other run of digits. ISBNs are matched as 10 or 13 digits with optional separators. URLs are matched by their http or https scheme. Matches are then normalised, deduplicated, and linked to the canonical resolver for their type.
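As a rough illustration of the per-type matching described above, here is a minimal Python sketch. The tool itself runs in the browser and its real expressions are not shown on this page, so every pattern below, along with the names PATTERNS and extract, is an assumption made for the example. The ISBN pattern in particular only counts digits; it does not verify the check digit and can pick up other long digit runs.

```python
import re

# Hypothetical patterns, not the tool's own. Group 1 captures the identifier.
PATTERNS = {
    "doi": re.compile(r"\b(10\.\d{4,9}/[^\s\"<>]+)"),
    "arxiv": re.compile(r"\b(\d{4}\.\d{4,5}(?:v\d+)?)\b"),
    "pmid": re.compile(r"\bPMID:\s*(\d{1,8})\b"),
    "isbn": re.compile(r"\b((?:\d[- ]?){12}\d|(?:\d[- ]?){9}[\dXx])\b"),
    "url": re.compile(r"(https?://[^\s<>\"]+)"),
}


def extract(text: str) -> dict[str, list[str]]:
    """Return every match per identifier type, in order of appearance."""
    found = {}
    for kind, pattern in PATTERNS.items():
        hits = []
        for match in pattern.finditer(text):
            # Trim sentence punctuation that the greedy patterns swallow.
            value = match.group(1).rstrip(".,;:)")
            if kind == "isbn":
                # Normalise ISBNs by dropping hyphens and spaces.
                value = re.sub(r"[- ]", "", value)
            hits.append(value)
        found[kind] = hits
    return found


sample = (
    "See doi:10.1000/182 and arXiv:2301.12345v2. PMID: 12345678. "
    "ISBN 978-0-306-40615-7. More at https://example.org/paper."
)
for kind, values in extract(sample).items():
    print(kind, values)
```

Requiring the PMID: prefix is what keeps the eight-digit number in the sample from being reported under any other type.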

Features & Benefits

Finds DOIs, arXiv IDs, PMIDs, ISBNs, and URLs in a single pass
Dedupes results so each identifier only appears once
Generates direct links to doi.org, arXiv, PubMed, and WorldCat
Exports as a plain list or a Markdown-formatted bibliography skeleton
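
The resolver links and the Markdown export can be pictured with a short Python sketch. The URL templates for doi.org, arXiv, and PubMed follow those sites' public link formats; the WorldCat template, the bullet layout, and the names RESOLVERS, LABELS, and to_markdown are assumptions made for illustration, not the tool's actual output format.

```python
# Assumed resolver URL templates, one per identifier type.
RESOLVERS = {
    "doi": "https://doi.org/{}",
    "arxiv": "https://arxiv.org/abs/{}",
    "pmid": "https://pubmed.ncbi.nlm.nih.gov/{}/",
    "isbn": "https://www.worldcat.org/isbn/{}",  # assumed WorldCat link shape
}

# Display prefixes for the link text (hypothetical).
LABELS = {"doi": "DOI", "arxiv": "arXiv", "pmid": "PMID", "isbn": "ISBN"}


def to_markdown(kind: str, identifier: str) -> str:
    """Format one identifier as a Markdown bullet linking to its resolver."""
    template = RESOLVERS.get(kind)
    # Plain URLs have no resolver, so they link to themselves.
    target = template.format(identifier) if template else identifier
    label = f"{LABELS[kind]} {identifier}" if kind in LABELS else identifier
    return f"- [{label}]({target})"


print(to_markdown("doi", "10.1000/182"))
print(to_markdown("url", "https://example.org/paper"))
```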

Frequently Asked Questions

Does this resolve the citations to full metadata?

No. It only extracts identifiers — fetching titles or authors would require external API calls, and this tool stays fully client-side.

How does it dedupe?

Identifiers are compared case-insensitively and deduped per type, so the same DOI appearing five times in your text shows up once.
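
A case-insensitive, per-type dedupe of this kind could look like the following Python sketch. The function names are hypothetical, and keeping the first-seen spelling and order is an assumption about behaviour the page does not spell out.

```python
def dedupe(identifiers: list[str]) -> list[str]:
    """Drop case-insensitive duplicates, keeping the first spelling seen."""
    seen = set()
    unique = []
    for item in identifiers:
        key = item.casefold()  # comparison key only; output keeps original case
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


def dedupe_by_type(found: dict[str, list[str]]) -> dict[str, list[str]]:
    """Apply dedupe separately to each identifier type."""
    return {kind: dedupe(values) for kind, values in found.items()}


print(dedupe(["10.1000/ABC", "10.1000/abc", "10.1000/xyz"]))
```

Because each type is handled on its own, a number that appears both as a PMID and inside a URL is still listed under both types.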

Why didn't it find my arXiv ID?

It expects the modern YYMM.NNNNN format. The older subject-class IDs (e.g., math.GT/0601001) aren't matched by the current pattern.
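
To see why the older IDs slip through, the Python sketch below runs an assumed modern-format pattern against both styles. MODERN is a guess at what a YYMM.NNNNN matcher looks like, not the tool's own expression, and LEGACY is a purely hypothetical pattern showing what matching the older archive/number style would involve; the tool does not include it.

```python
import re

# Assumed modern pattern: four digits, a dot, four or five digits, optional vN.
MODERN = re.compile(r"\b\d{4}\.\d{4,5}(?:v\d+)?\b")

# Hypothetical legacy pattern: archive, optional subject class, slash, 7 digits.
LEGACY = re.compile(r"\b[a-z]+(?:-[a-z]+)?(?:\.[A-Z]{2})?/\d{7}(?:v\d+)?\b")

for candidate in ["2301.12345v2", "math.GT/0601001", "hep-th/9901001"]:
    modern_hit = MODERN.search(candidate) is not None
    legacy_hit = LEGACY.search(candidate) is not None
    print(f"{candidate}: modern={modern_hit} legacy={legacy_hit}")
```

The older IDs have no dot between two digit groups, so a matcher built around YYMM.NNNNN has nothing to anchor on.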

Is my text sent anywhere?

No. All matching happens in your browser.
