export your linkedin saved posts with selenium and beautiful soup
we save things “to read later” and… rarely return. linkedin’s saved items helps, but curation inside the app can get messy. this walkthrough shows how to open your account, visit saved items, scroll the page, extract author, link, text, and date, and export everything to csv and json, so your reading list becomes searchable, shareable data.
(the same thing happens with ‘save to read later’)
what it does
```mermaid
flowchart TD
    a["start"] --> b["go to saved items"]
    b --> c["extract: author/link/text/date_label"]
    c --> d["compute date_approx"]
    d --> e["ask: months back"]
    e --> f{"older?"}
    f -- "no" --> g["scroll more"]
    g --> f
    f -- "yes" --> h["stop"]
    h --> i["deduplicate"]
    i --> j["write csv"]
    j --> k["write json"]
    k --> l["done"]
```
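to make the flowchart concrete, here is a minimal sketch of that loop. the saved-items url and the two helpers it calls, extract_posts() and parse_date_label(), are my own placeholders; both helpers are sketched later in the post.

```python
import time
from datetime import datetime, timedelta

def collect_saved_posts(driver, months_back):
    """Scroll the saved-items page until posts get older than the requested window."""
    # url is an assumption; adjust if linkedin moves the saved-items page
    driver.get("https://www.linkedin.com/my-items/saved-posts/")
    cutoff = datetime.now() - timedelta(days=30 * months_back)
    seen, posts = set(), []

    while True:
        # extract_posts() is a hypothetical helper (sketched below) that yields
        # dicts with author/link/text/date_label
        for post in extract_posts(driver.page_source):
            if not post["link"] or post["link"] in seen:
                continue  # deduplicate by link
            seen.add(post["link"])
            post["date_approx"] = parse_date_label(post["date_label"])
            posts.append(post)

        # stop once the oldest post collected is older than the requested window
        dates = [p["date_approx"] for p in posts if p["date_approx"]]
        if dates and min(dates) < cutoff:
            break

        # otherwise scroll further and wait for lazy-loaded content
        last_height = driver.execute_script("return document.body.scrollHeight")
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(2)
        if driver.execute_script("return document.body.scrollHeight") == last_height:
            break  # nothing new loaded, we reached the end of the list

    return posts
```

`driver` comes from the session setup sketched in the installation section, and `months_back` can be as simple as `int(input("months back? "))`.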
assumptions and guardrails
- linkedin ui language is english (relative labels: `mo` for month, `yr` for year); a parsing sketch follows this list.
- csv uses utf-8 with bom so excel opens emojis and accents correctly.
- the script tries several dom patterns to extract text/author across different post layouts (there is an extraction sketch below as well).
- the platform forbids scraping and automated activity that abuses the service; this walkthrough is for personal archiving of your own saved-items list, with a human logging in (one of the reasons i’m using a “manual” mode for login and consent flows).
- i suggest keeping 2fa enabled on your linkedin account.
- expect selectors to change over time.
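for the date math, here is a rough sketch of turning those english labels into an approximate date; the unit-to-days mapping is my own approximation, not something linkedin publishes:

```python
import re
from datetime import datetime, timedelta

# rough day counts for linkedin's english relative-time units (approximation)
_UNIT_DAYS = {"h": 0, "d": 1, "w": 7, "mo": 30, "yr": 365}

def parse_date_label(label):
    """Turn a label like '2mo', '1yr' or '3w' into an approximate datetime."""
    match = re.match(r"(\d+)\s*(mo|yr|h|d|w)", label.strip().lower())
    if not match:
        return None
    amount, unit = int(match.group(1)), match.group(2)
    return datetime.now() - timedelta(days=amount * _UNIT_DAYS[unit])
```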
 
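and this is the shape of the fallback extraction with beautiful soup. the selectors below are placeholders that only show the pattern (try a specific class, fall back to something generic); the real class names come from inspecting the live page and will drift over time:

```python
from bs4 import BeautifulSoup

def extract_posts(page_source):
    """Yield one dict per saved-post card found in the page html.
    All class names below are illustrative; inspect the dom and adjust."""
    soup = BeautifulSoup(page_source, "html.parser")
    for card in soup.select("li.reusable-search__result-container, div.entity-result"):
        link_tag = card.select_one("a[href*='/feed/update/'], a[href*='/posts/']")
        author_tag = (card.select_one("span.entity-result__title-text")
                      or card.select_one("span[dir='ltr']"))
        text_tag = (card.select_one("p.entity-result__summary")
                    or card.select_one("p"))
        date_tag = card.select_one("span.entity-result__secondary-subtitle, time")
        yield {
            "author": author_tag.get_text(strip=True) if author_tag else "",
            "link": link_tag["href"].split("?")[0] if link_tag else "",
            "text": text_tag.get_text(" ", strip=True) if text_tag else "",
            "date_label": date_tag.get_text(strip=True) if date_tag else "",
        }
```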
installation and files
you need a recent python 3 and these packages:
```bash
pip install selenium beautifulsoup4 pandas
```
- everything (script, requirements, notes, installation) lives in this folder: click here
- selenium manager usually auto-installs the correct browser driver (a minimal session setup is sketched after this list).
 - editor used: vs code.
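for the “manual” login mode mentioned in the guardrails, the session setup can stay tiny: selenium manager picks the driver and you do the login (and 2fa) yourself before the script continues. a sketch:

```python
from selenium import webdriver

def start_session():
    # selenium manager (bundled since selenium 4.6) resolves the matching
    # chromedriver automatically, so no driver path or extra download is needed
    driver = webdriver.Chrome()
    driver.get("https://www.linkedin.com/login")
    # manual mode: log in (password + 2fa) in the opened window yourself,
    # then come back to the terminal and press enter
    input("log in to linkedin in the browser window, then press Enter here... ")
    return driver
```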
 
output schema
| column | meaning |
|---|---|
| author | display name of the post author |
| link | canonical link to the post |
| text | main text of the post |
| date_label | relative ui label (e.g., 2mo, 1yr, 3w) |
| date_approx | approximate absolute date computed from date_label |
| extracted_on | date you ran the export |
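writing the files is the easy part; here is a sketch with pandas (the file names are just examples):

```python
from datetime import date
import pandas as pd

def export(posts, csv_path="linkedin_saved_posts.csv", json_path="linkedin_saved_posts.json"):
    df = pd.DataFrame(posts, columns=["author", "link", "text", "date_label", "date_approx"])
    df["extracted_on"] = date.today().isoformat()
    # utf-8-sig writes the bom so excel renders emojis and accents correctly
    df.to_csv(csv_path, index=False, encoding="utf-8-sig")
    df.to_json(json_path, orient="records", force_ascii=False, indent=2, date_format="iso")
```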
what now?
with the csv/json in hand, you choose your next step: the foundation is already laid and the rest is curiosity!
