The metadata you didn't know you were sending

AzizWares CTO

A friend sent a polished investor deck to a Riyadh family office last month. Clean design, sharp numbers, brand-perfect typography. Two days later, the family office's in-house engineer — out of habit, not malice — opened the .pptx in 7-Zip and read the XML inside.

Author: john.macbook-2022
Application: Canva
Last Modified By: Aziz B.
Producer (PDF export): Skia/PDF m130

The deck was supposed to come from a boutique strategy firm. The metadata told a different story: a freelancer in Canva, exporting through Chromium. The deal didn't die because of metadata — but trust took a real hit, and recovering it took weeks.

What's actually leaking

Every Office Open XML file (.pptx, .docx, .xlsx) is a ZIP archive with XML inside. Unzip one, open the files under docProps/ (core.xml, app.xml and, if present, custom.xml) in any text editor, and you'll find:

  • Author — usually your laptop's user account, exactly as it was set when you first ran Office.
  • Last Modified By — whoever last clicked Save. Tells the receiver who really owns this file.
  • Application — Microsoft PowerPoint if you're lucky, but more often LibreOffice, Canva, PptxGenJS, or Aspose. Each of these tells a story about your workflow.
  • Custom Properties — many tools quietly stash internal IDs, project codes, draft numbers.
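
To see these fields without unzipping by hand, here is a minimal Python sketch using only the standard library. The function name read_docprops and the labels it returns are illustrative choices, not part of any tool mentioned in this article. It assumes the package follows the usual layout: Author and Last Modified By in docProps/core.xml, Application in docProps/app.xml, and custom properties in an optional docProps/custom.xml.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the standard OOXML property parts.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

# (package part, element path, human-readable label) for the fields listed above.
FIELDS = [
    ("docProps/core.xml", "dc:creator", "Author"),
    ("docProps/core.xml", "cp:lastModifiedBy", "Last Modified By"),
    ("docProps/app.xml", "ep:Application", "Application"),
]


def read_docprops(path):
    """Return {label: value} for the property fields present in the package."""
    found = {}
    with zipfile.ZipFile(path) as pkg:
        names = set(pkg.namelist())
        for part, xpath, label in FIELDS:
            if part not in names:
                continue  # not every producer writes every part
            root = ET.fromstring(pkg.read(part))
            node = root.find(xpath, NS)
            if node is not None and node.text:
                found[label] = node.text
        # custom.xml is optional; report its presence rather than parse it.
        found["Has custom properties"] = "docProps/custom.xml" in names
    return found
```

Calling read_docprops("deck.pptx") on a file of your own (the filename is a placeholder) returns a small dictionary you can print before sending. The sketch only reads; it never modifies the file.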

PDFs are worse. The metadata lives in three places: the Info dictionary, the XMP packet, and incremental-update bytes that survive even after a "Save As." A PDF exported from a Chromium headless print stamps Producer: Skia/PDF into a place most PDF editors never touch.
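
As a rough way to check those three places, the sketch below scans the raw bytes of a PDF. It is a heuristic under stated assumptions, not a PDF parser. The function name scan_pdf_metadata is illustrative, and it only sees metadata stored uncompressed and unencrypted as simple literal strings. Hex-encoded strings, strings with nested parentheses, and values inside compressed object streams will be missed.

```python
import re


def scan_pdf_metadata(data: bytes) -> dict:
    """Heuristically locate metadata in raw PDF bytes (uncompressed parts only)."""
    info = {}
    for key in (b"Producer", b"Creator", b"Author"):
        # Match simple literal strings such as: /Producer (Skia/PDF m130)
        pattern = rb"/" + key + rb"\s*\(((?:\\.|[^\\)])*)\)"
        hits = re.findall(pattern, data)
        if hits:
            info[key.decode()] = [h.decode("latin-1") for h in hits]
    return {
        # Every occurrence is reported, including ones left in older revisions.
        "info_strings": info,
        # Each XMP packet starts with an xpacket processing instruction.
        "xmp_packets": data.count(b"<?xpacket begin"),
        # More than one %%EOF may indicate incremental updates
        # (linearized files can also contain two).
        "eof_markers": data.count(b"%%EOF"),
    }
```

Reading a file with open(path, "rb").read() and passing the bytes in is enough to try it. If the same key shows up several times with different values, an earlier revision of the Info dictionary is probably still sitting in the file.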

Why this matters more in our region

Saudi and MENA agencies, freelancers, and consultancies routinely white-label work — that's the model. A senior agency hires a specialist; a specialist subcontracts a designer; the deliverable lands on the client's desk under the agency's brand. The work is real and the white-label is honest. But the metadata silently betrays the chain. A "boutique consultancy" deck whose Author is ahmed-laptop reads, to the wrong reader, like a deception.

It almost never is. It's just that nobody cleaned the file.

Why the obvious fixes don't work

PowerPoint's File → Inspect Document covers core properties — but misses custom XML, comments, speaker notes you forgot were there, and embedded thumbnails. "Save a fresh copy" doesn't strip XMP packets in PDFs. Online "PDF metadata removers" upload your file to someone else's server, which is a worse problem than the one you started with.

To do this properly you have to: open the OOXML package, rewrite docProps/core.xml, scrub docProps/app.xml, walk the package for tool-fingerprint XML, lockstep-update [Content_Types].xml and _rels/.rels so nothing dangles, and for PDFs scrub the Info dictionary after the writer has finished — because most PDF libraries stamp Producer back in on write.
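
To make the first two steps concrete, here is a minimal Python sketch that copies an OOXML package and overwrites the author fields in docProps/core.xml and the Application field in docProps/app.xml. It is an illustration under assumptions, not how GoHumanize works internally. The name scrub_docprops and its parameters are hypothetical. Because it only edits text inside existing parts and never adds or removes a part, [Content_Types].xml and _rels/.rels are copied through unchanged. It does not hunt for tool-fingerprint XML elsewhere in the package and does not touch PDFs.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
    "vt": "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}
# Keep conventional prefixes when ElementTree re-serializes the parts;
# app.xml normally uses the extended-properties namespace as its default.
for _prefix, _uri in NS.items():
    ET.register_namespace("" if _prefix == "ep" else _prefix, _uri)


def scrub_docprops(src, dst, author, application):
    """Copy the package at src to dst, replacing author and application fields."""
    edits_by_part = {
        "docProps/core.xml": {"dc:creator": author, "cp:lastModifiedBy": author},
        "docProps/app.xml": {"ep:Application": application},
    }
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            edits = edits_by_part.get(item.filename)
            if edits:
                root = ET.fromstring(data)
                for xpath, value in edits.items():
                    node = root.find(xpath, NS)
                    if node is not None:  # only overwrite fields that exist
                        node.text = value
                data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
            # Every other part is written back byte for byte.
            zout.writestr(item, data)
```

A call such as scrub_docprops("in.pptx", "out.pptx", author="Studio", application="Microsoft Office PowerPoint") writes a cleaned copy and leaves the original untouched. The values are placeholders. Treat the output as something to inspect again, since other parts of the package can still carry fingerprints.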

Nobody does this by hand. You wouldn't either.

The إتقان angle

Itqan — إتقان — is the Arabic word for craftsmanship that doesn't draw attention to itself. The senior delivery and the junior delivery look identical to the receiver. The difference is the things nobody asked about: the tabular alignment in the appendix, the right glyph for ﷻ, the cleaned metadata. Done well, the receiver feels nothing in particular. Done poorly, the receiver feels something is off — and they're never sure why.

What we built

We built GoHumanize originally to fix our own client deliverables. It's a small desktop app — Mac, Windows, Linux — that opens an OOXML or PDF, shows you exactly what's inside, and either replaces the metadata with clean business defaults or strips it entirely. It works locally, on your machine. No upload, no telemetry, no cloud. Free.

It is, frankly, the kind of tool we shouldn't have had to build. But here we are.

One actionable thing

Before you send the next client deck, run gohumanize inspect on it. You will find at least one thing you didn't know was there. Download GoHumanize →
