export your linkedin saved posts with selenium and beautiful soup

we save things “to read later” and… rarely return. linkedin’s saved items feature helps, but curation inside the app can get messy. this walkthrough shows how to log in to your account, open saved items, scroll the page, extract each post’s author, link, text, and date, and export everything to csv and json, so your reading list becomes searchable, shareable data.

(same thing happens with ‘save to read later’)

what it does

flowchart TD
    a["start"] --> b["go to saved items"]
    b --> c["extract: author/link/text/date_label"]
    c --> d["compute date_approx"]
    d --> e["ask: months back"]
    e --> f{"older?"}
    f -- "no" --> g["scroll more"]
    g --> f
    f -- "yes" --> h["stop"]
    h --> i["deduplicate"]
    i --> j["write csv"]
    j --> k["write json"]
    k --> l["done"]
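
the scroll-until-older loop in the flowchart can be sketched without any browser code. this is a minimal sketch, not the script from the post: `collect_until_cutoff`, `read_visible`, and `scroll_more` are hypothetical names, and the sketch assumes each extracted item is a dict that already carries a `date_approx` value. in a real run the two callables would wrap selenium and beautiful soup.

```python
from datetime import date
from typing import Callable, Dict, List

Post = Dict[str, object]


def collect_until_cutoff(
    read_visible: Callable[[], List[Post]],
    scroll_more: Callable[[], bool],
    cutoff: date,
    max_scrolls: int = 200,
) -> List[Post]:
    """Keep reading and scrolling until a post older than `cutoff` shows up.

    read_visible: returns the posts currently present on the page (assumed).
    scroll_more:  scrolls once; returns False when nothing new loaded (assumed).
    max_scrolls:  safety cap so the loop always ends.
    """
    collected: List[Post] = []
    for _ in range(max_scrolls):
        batch = read_visible()
        # duplicates are expected here: the same posts are re-read after each
        # scroll, and the flowchart removes them in a later step
        collected.extend(batch)
        dates = [p["date_approx"] for p in batch if p.get("date_approx") is not None]
        if any(d < cutoff for d in dates):
            break  # "older?" -> yes: stop
        if not scroll_more():
            break  # end of the list reached before the cutoff
    return collected
```

passing the page access in as callables keeps the stop condition easy to try with fake data before pointing it at a real browser session.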

assumptions and guardrails

  • linkedin ui language is english (relative labels: mo for month, yr for year).
  • csv uses utf-8 with bom so excel opens emojis and accents correctly.
  • the script tries several dom patterns to extract text/author across different post layouts.
  • the platform forbids scraping and automated activity that abuses the service. this walkthrough is for personal archiving of your own saved items list, with a human logging in (one of the reasons i use a “manual” mode for login and consent flows).
  • i suggest keeping 2fa enabled on your linkedin account.
  • expect selectors to change over time.
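
to make the relative-label assumption concrete, here is a minimal sketch of turning a label such as 2mo into an approximate date. it is an illustration, not the script from the post: the function name `approx_date`, the 30-day month, the 365-day year, and the extra d and h units are all assumptions.

```python
import re
from datetime import date, timedelta
from typing import Optional

# rough day counts per unit; "mo" and "yr" are deliberately approximate,
# and "d" / "h" are assumed units not mentioned in the post
_DAYS_PER_UNIT = {"h": 0, "d": 1, "w": 7, "mo": 30, "yr": 365}
_LABEL_RE = re.compile(r"^\s*(\d+)\s*(mo|yr|h|d|w)\b", re.IGNORECASE)


def approx_date(label: str, today: date) -> Optional[date]:
    """Convert a relative ui label into an approximate absolute date.

    Returns None when the label does not match the english patterns above.
    """
    match = _LABEL_RE.match(label)
    if match is None:
        return None
    count = int(match.group(1))
    unit = match.group(2).lower()
    return today - timedelta(days=count * _DAYS_PER_UNIT[unit])


def is_older_than(label: str, months_back: int, today: date) -> bool:
    """True when the label points further back than `months_back` months."""
    approx = approx_date(label, today)
    if approx is None:
        return False  # unknown labels never trigger the stop condition
    return approx < today - timedelta(days=30 * months_back)
```

because the arithmetic is approximate, the result is best treated as a sorting and filtering aid rather than a real publication date.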

installation and files

you need a recent python 3 and these packages:

pip install selenium beautifulsoup4 pandas
  • everything (script, requirements, notes, installation) lives in this folder (click here).
  • selenium manager usually auto-installs the correct browser driver.
  • editor used: vs code.

output schema

| column | meaning |
|---|---|
| author | display name of the post author |
| link | canonical link to the post |
| text | main text that follows the post |
| date_label | relative ui label (e.g., 2mo, 1yr, 3w) |
| date_approx | approximate absolute date computed from date_label |
| extracted_on | date you ran the export |
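
the deduplicate and write steps for this schema can be sketched with the standard library alone. this is a sketch under assumptions, not the script from the post (which lists pandas among its packages): `dedupe_by_link` and `write_exports` are hypothetical names, rows are assumed to be dicts of strings, and the link is assumed to be a good enough key for spotting duplicates.

```python
import csv
import json
from typing import Dict, List

FIELDS = ["author", "link", "text", "date_label", "date_approx", "extracted_on"]


def dedupe_by_link(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first row seen for each link, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row["link"] in seen:
            continue
        seen.add(row["link"])
        unique.append(row)
    return unique


def write_exports(rows: List[Dict[str, str]], csv_path: str, json_path: str) -> None:
    """Write the same rows to csv (utf-8 with bom) and json (plain utf-8)."""
    # "utf-8-sig" prepends the bom so excel detects the encoding
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    # ensure_ascii=False keeps emojis and accents readable in the json file
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump([{k: row.get(k, "") for k in FIELDS} for row in rows],
                  f, ensure_ascii=False, indent=2)
```

a call such as `write_exports(dedupe_by_link(rows), "saved_posts.csv", "saved_posts.json")` would produce both files; the file names here are placeholders.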

what now?

with the csv/json in hand, you choose your next step: the foundation is already laid and the rest is curiosity!

This post is licensed under CC BY 4.0 by the author.