This is my actual data

I downloaded my LinkedIn archive. Here's what was in it.

What LinkedIn Actually Keeps on You

My name is Jenny. I work in creative strategy and AI consulting. I downloaded my LinkedIn data archive on February 10, 2026. It contained 40+ files and 38,000+ rows of data — tracking everything from my DMs to my search history to whether I'm an "active contributor who influences public opinion." Here's what's inside. And what it means.

01 — Identity & Profile

Everything that makes you, you

The basics: your name, your face, your work history, your skills. Standard stuff, until you notice they also keep your zip code and your Twitter handle, and store your email addresses and phone numbers in separate files of their own.

csv Profile.csv
1 row

Your name, headline, summary, industry, zip code, geo location, Twitter handles, websites, and instant messenger IDs. All in one row.

Columns: First Name, Last Name, Maiden Name, Address, Birth Date, Headline, Summary, Industry, Zip Code, Geo Location, Twitter Handles, Websites, Instant Messengers

csv Positions.csv
~10 rows

Your complete employment history — company name, title, description, location, and start/end dates. Every role you've ever listed.

csv Email Addresses.csv
Stored separately

Every email address you've ever associated with your account, kept in its own dedicated file.

csv PhoneNumbers.csv
Stored separately

Your phone numbers — also in their own file, separate from your profile.

csv Education.csv  /  Skills.csv  /  Registration.csv

Your schools, self-listed skills, and the date you first registered. Skills includes things you added years ago and forgot about.

What this enables

Individually, these files are benign. Combined, they're a complete identity profile. Full name + zip code + phone number + email + employment history is everything needed for targeted phishing, impersonation, or social engineering. In a data breach, this isn't just an email and password — it's you.


02 — Your Social Graph

Everyone you know, and how

Not just your connections — everyone you follow, every company you follow, every hashtag you follow, and the contacts you imported from your phone or email.

csv Connections.csv
~2,665 rows Big

Every connection: name, profile URL, email address, company, position, and exact date connected.

Notes: "When exporting your connection data, you may notice that some of the email addresses are missing..."
Columns: First Name, Last Name, URL, Email Address, Company, Position, Connected On

csv ImportedContacts.csv
From your phone

Remember when LinkedIn asked to "find people you know"? These are the contacts you uploaded from your phone or email. They kept them.

csv Invitations.csv  /  Member_Follows.csv  /  Company Follows.csv  /  Hashtag_Follows.csv

Every invitation (with your personal message), plus three separate files for three types of follows — people, companies, and hashtags. Each tracked individually.

My network — 2,664 connections mapped over time
New connections per year:
'07: 57
'08: 13
'10: 26
'11: 55
'12: 57
'13: 61
'14: 39
'15: 48
'16: 76
'17: 140
'18: 118
'19: 66
'20: 88
'21: 96
'22: 160
'23: 396
'24: 373
'25: 138
'26: 25

What LinkedIn knows about my network by role
Directors: 422
Founders: 331
Creative Directors: 310
C-Suite: 210
Managers: 149
Partners: 99
Heads of...: 93
CEOs: 92
Freelancers: 90
Consultants: 69
VPs: 63
Producers: 57
Designers: 52
Copywriters: 51
Strategists: 49
Owners: 49

Without revealing a single name, this data paints a clear picture: senior creative industry, heavy founder/C-suite network, big spike in 2023–2024 (career transition visible from space). LinkedIn knows your seniority level, your industry, your career trajectory, and exactly when your professional life changed — all from connection metadata.
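If you want to rebuild a per-year tally like this from your own Connections.csv, here is a minimal sketch. It assumes the file opens with the "Notes:" preamble quoted earlier before the real header row, and that Connected On uses the "08 Feb 2026" style. The function name and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def connections_per_year(lines):
    """Count connections by the year found in the 'Connected On' column.

    `lines` is any iterable of text lines, such as an open file.
    Anything before the header row (the notes preamble) is skipped.
    """
    rows = list(lines)
    # Assumption: the header is the first line starting with "First Name".
    start = next(i for i, line in enumerate(rows) if line.startswith("First Name"))
    years = Counter()
    for row in csv.DictReader(rows[start:]):
        connected = (row.get("Connected On") or "").strip()
        if connected:
            # Assumption: dates look like "08 Feb 2026".
            years[datetime.strptime(connected, "%d %b %Y").year] += 1
    return dict(sorted(years.items()))


# Made-up sample standing in for the real file.
sample = io.StringIO(
    "Notes:\n"
    '"When exporting your connection data, some email addresses may be missing."\n'
    "\n"
    "First Name,Last Name,URL,Email Address,Company,Position,Connected On\n"
    "Ada,Example,https://example.com/a,,Acme,Director,08 Feb 2026\n"
    "Bo,Example,https://example.com/b,,Acme,Founder,14 Mar 2023\n"
    "Cy,Example,https://example.com/c,,Acme,CEO,02 Jan 2023\n"
)
print(connections_per_year(sample))
```

To run it on a real export, pass an open file instead of the sample, for example `open("Connections.csv", newline="", encoding="utf-8")`.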

What this enables

Your social graph is a relationship map. It reveals who you trust, who you work with, and who influences you. Recruiters pay LinkedIn to see this. Advertisers use it for lookalike targeting. And if this data leaks, it's a roadmap for spear-phishing — "Hey, your colleague Sarah mentioned you'd be interested in this..." LinkedIn also keeps the contacts you imported from your phone years ago, including people who never signed up for LinkedIn.


03 — Everything You've Ever Said or Done

A complete record of your activity

Every message. Every comment. Every post. Every reaction. Going back years. Over 27,000 rows of your words and actions.

csv messages.csv
12,468 rows Largest file

Every DM you've ever sent or received. Full message content, timestamps, conversation IDs, sender/recipient profile URLs, folder location, attachments, and whether it was a draft. Twelve thousand messages.

Columns: CONVERSATION ID, CONVERSATION TITLE, FROM, SENDER PROFILE URL, TO, RECIPIENT PROFILE URLS, DATE, SUBJECT, CONTENT, FOLDER, ATTACHMENTS, IS MESSAGE DRAFT

csv Comments.csv
7,420 rows Big

Every comment you've left on any post, with the date, a link to the post, and the full text of your comment.

csv Shares.csv
4,707 rows Big

Every post you've shared — date, link, your full commentary text, any shared URL, media URL, and visibility setting.

csv Reactions.csv
2,576 rows

Every reaction, categorized by type. Not just "like": LinkedIn separately tracks LIKE, PRAISE, EMPATHY, ENTERTAINMENT, INTEREST, and APPRECIATION. Each is a data point about your emotional response to content.

Date, Type, Link
2026-02-09, EMPATHY, https://linkedin.com/feed/...
2026-02-08, APPRECIATION, https://linkedin.com/feed/...
2026-02-08, PRAISE, https://linkedin.com/feed/...
2026-02-05, ENTERTAINMENT, https://linkedin.com/feed/...
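Counting your own reactions by type takes only a few lines. This is a rough sketch that assumes the Date, Type, Link header shown above; the function name is hypothetical, and the sample reuses the four rows quoted in this section with their links left elided.

```python
import csv
import io
from collections import Counter


def reaction_counts(lines):
    """Tally the 'Type' column of a Reactions.csv-style file."""
    # skipinitialspace tolerates "Date, Type, Link" as well as "Date,Type,Link".
    reader = csv.DictReader(lines, skipinitialspace=True)
    return Counter(row["Type"].strip().upper() for row in reader if row.get("Type"))


sample = io.StringIO(
    "Date, Type, Link\n"
    "2026-02-09, EMPATHY, https://linkedin.com/feed/...\n"
    "2026-02-08, APPRECIATION, https://linkedin.com/feed/...\n"
    "2026-02-08, PRAISE, https://linkedin.com/feed/...\n"
    "2026-02-05, ENTERTAINMENT, https://linkedin.com/feed/...\n"
)
for reaction, count in reaction_counts(sample).most_common():
    print(reaction, count)
```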
csv SearchQueries.csv
8,225 rows Years of data

Every search query, timestamped, going back years. The same search often appears 2–4 times with identical timestamps, suggesting LinkedIn logs each filter interaction as a separate event. Searching job titles? LinkedIn uses that to infer you're job hunting — and sells that signal to recruiters.

Time, Search Query
2024/02/01 19:09:27 UTC, creative strategist
2024/02/01 19:09:27 UTC, creative strategist
2024/02/01 19:09:27 UTC, head of customer experience
2024/02/01 19:09:27 UTC, head of customer experience
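Because the same search shows up several times with an identical timestamp, the raw row count overstates how many searches were actually made. Here is one possible way to collapse the repeats, assuming the Time and Search Query columns shown above; the function name is made up.

```python
import csv
import io


def unique_searches(lines):
    """Collapse rows that repeat the same (time, query) pair, keeping order."""
    reader = csv.DictReader(lines, skipinitialspace=True)
    seen = set()
    unique = []
    for row in reader:
        key = (row["Time"].strip(), row["Search Query"].strip())
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# The four rows quoted above.
sample = io.StringIO(
    "Time, Search Query\n"
    "2024/02/01 19:09:27 UTC, creative strategist\n"
    "2024/02/01 19:09:27 UTC, creative strategist\n"
    "2024/02/01 19:09:27 UTC, head of customer experience\n"
    "2024/02/01 19:09:27 UTC, head of customer experience\n"
)
for time, query in unique_searches(sample):
    print(time, "|", query)
```

On the four sample rows this keeps two entries, one per distinct query. A search repeated at a different time still counts separately, which is the behavior I wanted here.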
csv Votes.csv  /  InstantReposts.csv  /  Saved_Items.csv  /  Events.csv

Poll votes, reposts, saved items, and event interactions. Every micro-action, catalogued.

My emotional profile — built from 2,575 reactions
Praise: 911
Like: 849
Empathy: 683
Entertainment: 89
Interest: 33
Appreciation: 10
How my reactions changed over time
[Chart: reactions per year, 2017 to 2025, broken down by type: Like, Praise, Empathy, Entertainment, Interest, Appreciation]

Before 2019, "Like" was the only option — no emotional data. Once LinkedIn added reaction types, my pattern shifted: I lead with Praise and Empathy, rarely just "Like" anymore. LinkedIn can read this shift. They know I respond to people's struggles and celebrate their wins. That's not a feature preference. That's a personality profile.

What this enables

This is the algorithm. Your reaction types tell LinkedIn what emotions drive your engagement. Your searches reveal your intent — job titles you're curious about, people you're researching. Your comments and shares tell them what topics activate you. All of this feeds the ranking model that controls what shows up in your feed every day. It's also training data for AI — 12,000+ messages and 7,000+ comments of natural human conversation, with emotional labels attached, owned by the same company that owns a major stake in OpenAI.


04 — What LinkedIn Thinks About You

Your advertising dossier

This is where it gets real. LinkedIn maintains a massive profile of inferred attributes about you — and sells access to it. Here's what mine looks like.

csv Inferences_about_you.csv
4 rows Quiet but creepy

Just four rows, but each one is a judgment LinkedIn has made about you:

• HR professional? NO
• Inferred gender: FEMALE
• Interested in a new job? TRUE
• "Active contributor who influences public opinion" TRUE

These are the categories advertisers can target. LinkedIn decided you "influence public opinion" based on "factors such as your experience, industry, and activity."

csv Ad_Targeting.csv
1 row, ~400+ labels The big one

A single row with hundreds of semicolon-delimited values crammed into each cell. Columns include: age range, gender, buyer groups, company names, company followers, company category, company size, degrees, schools, growth rate, fields of study, company connections, job functions, member gender, groups, industries, interests, locales, traits, locations, revenue, seniorities, skills, job titles, and years of experience.
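A file shaped like this is easier to read once each cell is split into its individual labels. The sketch below assumes one header row followed by one data row, with semicolons separating labels inside a cell. The column names in the sample ("Interests", "Job Titles") are placeholders, not the export's real headers, and the function name is hypothetical.

```python
import csv
import io


def split_targeting_row(lines):
    """Turn the single data row into {column name: [labels]}."""
    row = next(csv.DictReader(lines))
    return {
        column: [label.strip() for label in (cell or "").split(";") if label.strip()]
        for column, cell in row.items()
        if column is not None  # ignore stray cells with no header
    }


# Placeholder columns; the labels are taken from the list on this page.
sample = io.StringIO(
    "Interests,Job Titles\n"
    '"Content Marketing; Swarm Robotics; Baby Showers","Creative Strategist"\n'
)
for column, labels in split_targeting_row(sample).items():
    print(f"{column}: {len(labels)} labels")
```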

csv Ads Clicked.csv  /  LAN Ads Engagement.csv
495 rows

Every ad you've clicked and your engagement with the LinkedIn Audience Network (ads on third-party sites that LinkedIn serves).

The Ad_Targeting file, visualized

These are the 400+ interest labels LinkedIn assigned to me — a creative strategist. Red tags are the ones that have nothing to do with my actual work or interests.

Content Marketing Mobile Advertising Online Media Market Segmentation Pay-Per-Click Marketing Metrics Strategic Management Advertising Strategies Health Management Consulting Sales and Retail Market Research Marketing and Advertising Legal Services Natural Language Processing Direct Marketing Computer Animation Business Administration Chatbots in Marketing Society and Culture Swarm Robotics Hypertargeting Business Consulting Social Media Marketing Salary and Wages Search Engine Marketing Robotics Entertainment Electronics Native Advertising Email Marketing Content Strategy Social Media Renewable Energy Neuromarketing Affinity Marketing Retargeting Business Analysis Computer Graphics Blogs and Blogging Human-Computer Interaction Marketing Technology Business Travel Design Software Technology Working Environments Science and Environment Design and Visual Arts Drones Special Effects TV and Radio IDEs Marketing Communications Journalism Internet Infrastructure Human Resources Distributed Computing Wearable Tech News Media Dark Social Business Technology Quantum Computing Computer Networks CRM Careers and Employment Engagement Marketing Computer Programming Brand Equity Consumer Electronics Chatbots Paid Social People Management Sales Channels Artificial Intelligence Interactive Content Marketing Automation Copywriting Arts and Entertainment Programmatic Marketing Video Games Media Planning Employee Benefits Mobile Technology Media Buying Helpdesk and Customer Support Out-of-Home Advertising Audio-Visual Production B2B Marketing Customer Experience Machine Learning Disruptive Innovation Computer Software E-Commerce Brand Personality Marketing Performance User Experience Digital Marketing Finance and Economy Financial Technology Workplace Wellness Product Placement SEO Employee Engagement Nanorobotics Marketing Research Brand Management Politics and Law Social Issues Virtual Reality Marketing Lead Generation Augmented Reality Baby Tech Professional Networking Brand 
Awareness Energy Deep Learning Targeted Advertising Influencer Marketing Electronics Commercial Sponsorship Financial Management Marketing Mix Marketing Strategies Swordmanship Baby Showers Active DoD Secret Clearance Auto Racing Trail Running Weapons Training Wedding Industry Personal Finance Travel and Tourism Performing Arts Liberal Arts Healthcare Advertising Government Contracting Multi-level Marketing Outdoor Recreation Housing Policy Electricity Markets Pick and Pack Movie Magic Sound Art Talk Radio Catalogs Flyers Investing Exterior Finishes Medical Social Work Adult Education Durable Goods Sales Vehicle Maintenance On-board Diagnostics Doing More with Less eBook Publishing Construction Management Film Production Magics Show Calling U.S. Department of Defense
Swordmanship. Baby Showers. DoD Secret Clearance.

I work in creative strategy and AI consulting. I have never expressed interest in Swordmanship, Baby Showers, Active DoD Secret Clearance, Nanorobotics, Weapons Training, or Pick and Pack. I didn't list these. LinkedIn inferred them. And these labels aren't just for ads — they determine what opportunities reach you. Job postings, recruiter outreach, and sponsored content are all filtered through this profile. The inferences are wildly inaccurate, which means the entire system is built on confident guesses. Every person who pays for a LinkedIn ad is bidding on these labels. Every person scrolling LinkedIn is being sorted by them.


05 — Your Behavior Patterns

The quiet files

These are the files most people would skip over. Login records, security events, endorsements, learning activity, payment receipts. The long tail of tracking.

csv Logins.csv
Every login

A timestamped record of every time you've logged in. Combined with IP data, this is a location history.

csv Security Challenges.csv

Every security challenge event — CAPTCHAs, two-factor prompts, suspicious login attempts.

csv Learning.csv 5 AI files
136 rows

Your LinkedIn Learning activity — every course viewed or completed. But notably, the export now includes four additional files for AI features: LearningCoachMessages.csv, learning_coach_messages.csv (yes, both exist), learning_role_play_messages.csv, and guide_messages.csv. These track your conversations with LinkedIn's AI coaching tools.

What this enables

Login timestamps + IP addresses = a location history LinkedIn never explicitly asked you to share. The AI coaching files are especially notable: LinkedIn now stores your conversations with their AI tools, which means your career anxieties, skill gaps, and professional insecurities — the things you'd only share with a "coach" — are now rows in a database.


06 — The Mess Behind the Curtain

Four date formats, zero consistency

The export itself tells a story: this data looks like it was assembled by different teams with zero coordination. Here's what I mean.

Date formats across files

Connections.csv: 08 Feb 2026
Shares.csv: 2026-02-09 01:21:47
SearchQueries.csv: 2024/02/01 15:02:55 UTC
Invitations.csv: 1/29/26, 2:37 PM
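Anyone analyzing these files together has to normalize the dates first. A minimal sketch: try each of the four formats in turn. The format strings are inferred from the four samples above (including the guess that Invitations.csv is month/day/year), so other files or other locales may need more entries.

```python
from datetime import datetime

# Inferred from the four samples above; not an official list.
FORMATS = (
    "%d %b %Y",               # Connections.csv: 08 Feb 2026
    "%Y-%m-%d %H:%M:%S",      # Shares.csv: 2026-02-09 01:21:47
    "%Y/%m/%d %H:%M:%S UTC",  # SearchQueries.csv: 2024/02/01 15:02:55 UTC
    "%m/%d/%y, %I:%M %p",     # Invitations.csv: 1/29/26, 2:37 PM
)


def parse_export_date(text):
    """Return a naive datetime for any of the four formats, else raise ValueError."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


for raw in ("08 Feb 2026", "2026-02-09 01:21:47",
            "2024/02/01 15:02:55 UTC", "1/29/26, 2:37 PM"):
    print(raw, "->", parse_export_date(raw).isoformat())
```

The results carry no time zone. Only the SearchQueries.csv sample states UTC, so treating the other three as UTC would be a guess.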

File naming conventions (or lack thereof)

Snake_Case: Ad_Targeting.csv
Spaces: Ads Clicked.csv
PascalCase: SearchQueries.csv
lowercase: messages.csv

LinkedIn gives you two export options. One hides the worst stuff.

When you request your data, LinkedIn offers a "Basic" export (ready in minutes) and a "Complete" export (takes ~24 hours). The Basic version contains about half the files. Guess which half is missing? Your search history. Your reactions. Your login records. The inferences they've made about you. The ad targeting labels. Your comments. Your shares. All the AI coaching messages. Everything on this page that made you uncomfortable is only in the Complete archive. If you've downloaded your data before and thought "this doesn't seem so bad" — you probably got the Basic.


07 — Why This Matters Right Now

This isn't theoretical

Three things happened in the last few months that make this archive more than an interesting exercise.

4.3 billion records were found in an exposed database

Nov 2025
4,300,000,000

Cybersecurity researchers discovered an unsecured MongoDB database containing 16 terabytes of professional records scraped from LinkedIn profiles. Names, emails, phone numbers, job roles, work history, education, skills, photos. The data sat exposed on the open internet until researchers flagged it.

The connection to your archive: The exposed data matches the exact fields in your LinkedIn export — Profile.csv, Positions.csv, Skills.csv, Education.csv, Email Addresses.csv. The difference is that your archive is one person. That breach was nearly everyone.

LinkedIn started using your data to train AI — and opted you in by default

Nov 2025

As of November 2025, LinkedIn's updated privacy policy lets it use member data to train generative AI models by default. Your posts, comments, reactions, profile information, and feed interactions now feed into Microsoft's AI ecosystem. You can opt out, but anything collected before your opt-out date is already in the training data. And most users were never notified.

The connection to your archive: Those 12,468 messages, 7,420 comments, 4,707 posts, and 2,575 emotionally-labeled reactions? That's not just stored data. It's training data — text with sentiment labels attached, exactly what AI models need to learn how humans communicate. LinkedIn owns it. Microsoft builds the models.

Women tested LinkedIn for gender bias. LinkedIn said it doesn't use gender. Their own export says otherwise.

Nov–Dec 2025

In the #WearThePants experiment, women changed their LinkedIn gender settings to male and saw dramatic increases in reach — 200% to 818% more impressions. LinkedIn's head of responsible AI stated that their systems "do not use demographic information (such as age, race, or gender) as a signal to determine the visibility of content."

The connection to your archive: Open Inferences_about_you.csv. Right there in row 3: "Inferred gender: FEMALE." LinkedIn collects it. They categorize it. They make it available to advertisers. They say the algorithm doesn't use it. But they definitely have it — and anyone who downloads their Complete archive can see that for themselves.
One last thing

This is one platform.

My LinkedIn Profile.csv contains my Twitter handle, my personal website, and my email address. My Connections.csv includes emails for 116 people in my network. My imported contacts have personal phone numbers and emails going back to 2007. These are cross-platform identifiers — the exact keys a data broker needs to link a LinkedIn profile to Facebook activity, Google searches, Instagram interests, Amazon purchases, and location history.

Every platform has an archive like this one. Most people have never downloaded any of them. Each archive alone is revealing. Linked together, they're a complete portrait of who you are, what you want, what you fear, and what you'll do next.

LinkedIn is just the one that's supposed to be "professional."

See your own file

LinkedIn is required to give you a copy of your data. But you have to ask for the right one.

Download Your Archive
1. Go to Settings → Data Privacy → Get a copy of your data

2. Select "Download larger data archive", not the default. The default "Basic" export hides the most revealing files (search history, ad targeting, reactions, logins, inferences).

3. Wait ~24 hours. The Complete archive takes longer because there's a lot more in it.

4. Unzip it. Open the CSVs. See what they know.

5. While you're there: Settings → Data Privacy → Data for Generative AI Improvement → toggle it off. Anything before your opt-out date is already in the training data, but you can stop the bleeding.
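Once the archive is unzipped, a quick inventory shows which files are worth opening first. This is a rough sketch, not an official tool: the folder name in the usage comment is a placeholder, the function name is made up, and files that begin with a notes preamble (Connections.csv, for one) will come out a few rows high.

```python
import csv
from pathlib import Path


def archive_inventory(folder):
    """Return (file name, data row count) pairs, largest file first."""
    inventory = []
    for path in sorted(Path(folder).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8", errors="replace") as handle:
            # csv.reader counts a quoted multi-line message as one row.
            rows = sum(1 for _ in csv.reader(handle))
        inventory.append((path.name, max(rows - 1, 0)))  # minus the header row
    return sorted(inventory, key=lambda item: item[1], reverse=True)


# Usage, with a placeholder folder name:
# for name, rows in archive_inventory("linkedin_export"):
#     print(f"{rows:>7}  {name}")
```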