LinkedIn Profile Scraper in Python

Scrape public LinkedIn profile data using Selenium and BeautifulSoup in Python. Covers automated login, profile extraction, and exporting structured results.

Sep 6, 2020 | Updated May 21, 2026 | 29 min read

Topics You Will Master

Automating LinkedIn login with Selenium WebDriver
Extracting profile fields: name, headline, location, experience, and education
Parsing HTML with BeautifulSoup for structured data extraction
Saving scraped profile data to CSV or JSON

LinkedIn Profile Scraping using Selenium and BeautifulSoup

Extracting profile details from LinkedIn is a valuable task for marketing, lead generation, talent acquisition, and market research. In this project, you will build a Python scraping script that programmatically logs into your LinkedIn account, navigates to a target profile, and extracts structured fields such as name, location, job history, and education.

To build this tool, you will use Selenium WebDriver for browser automation (including handling login and scrolling to trigger lazy-loaded sections) and BeautifulSoup to parse and structure the HTML content.

Prerequisites and Setup

First, install the required packages:

BASH
pip install selenium beautifulsoup4 lxml

Next, download the browser driver that matches your browser version. For Google Chrome, download the appropriate binary from the ChromeDriver Downloads page and save it as driver/chromedriver.exe inside your project folder, which is the path the script below uses.

Additionally, create a config.txt file in your project root containing your LinkedIn credentials:

PLAINTEXT
your_linkedin_email@example.com
your_secure_password
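
Lines read from a text file keep their trailing newline, and a newline typed into a form field behaves like pressing Enter. The following is a minimal sketch of a loader that strips whitespace and fails early when the file is malformed. The function name read_credentials is hypothetical and not part of the original script.

```python
from pathlib import Path


def read_credentials(path="config.txt"):
    """Return (email, password) from a two-line text file.

    Hypothetical helper: line 1 is the email, line 2 is the password.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Drop blank lines and surrounding whitespace (including '\r' from Windows files).
    values = [line.strip() for line in lines if line.strip()]
    if len(values) < 2:
        raise ValueError(f"{path} must contain an email line and a password line")
    return values[0], values[1]
```

Calling username, password = read_credentials() would then replace the manual readlines() indexing. Keep config.txt out of version control, since it holds a plain-text password.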

This project builds directly on top of the authentication and connection workflows from the LinkedIn Auto Connect Bot tutorial.

The following imports bring in the necessary libraries.

PYTHON
import time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

The code below opens Chrome through the driver, navigates to the LinkedIn login page, reads credentials from config.txt, and performs the automated login. find_element(By.ID, ...) returns the first element matching the given id (the older find_element_by_id() helper was removed in Selenium 4). send_keys() types text into a field. submit() submits the form.

PYTHON
# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service('driver/chromedriver.exe'))
browser.get('https://www.linkedin.com/uas/login')

# Strip each line: a trailing newline sent to a field acts like pressing Enter
with open('config.txt') as file:
    lines = file.readlines()
username = lines[0].strip()
password = lines[1].strip()

elementID = browser.find_element(By.ID, 'username')
elementID.send_keys(username)

elementID = browser.find_element(By.ID, 'password')
elementID.send_keys(password)

elementID.submit()

link holds the URL of the profile to scrape. You can target any public profile, or loop over a list of links to scrape multiple profiles.

PYTHON
link = 'https://www.linkedin.com/in/rishabh-singh-61b706114/'
browser.get(link)

The full profile is not loaded immediately. Only the visible portion loads on first render, so the script must scroll to the bottom to trigger all lazy-loaded sections. The code below scrolls to the end of the page.

PYTHON
SCROLL_PAUSE_TIME = 5

# Get scroll height
last_height = browser.execute_script("return document.body.scrollHeight")

for i in range(3):
    # Scroll down to bottom
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = browser.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
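
A fixed five-second pause wastes time when new content arrives quickly. One possible alternative, sketched here under assumptions, polls the page height and moves on as soon as it grows. This is a hand-rolled polling loop, not Selenium's WebDriverWait, and the names scroll_to_bottom, max_rounds, timeout, and poll are hypothetical. The function only needs an object with an execute_script method, such as the browser created earlier.

```python
import time

HEIGHT_JS = "return document.body.scrollHeight"
SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight);"


def scroll_to_bottom(driver, max_rounds=10, timeout=5.0, poll=0.25):
    """Scroll until the page height stops growing; return the final height."""
    last_height = driver.execute_script(HEIGHT_JS)
    for _ in range(max_rounds):
        driver.execute_script(SCROLL_JS)
        deadline = time.monotonic() + timeout
        new_height = last_height
        # Poll the height instead of sleeping for the full timeout.
        while time.monotonic() < deadline:
            new_height = driver.execute_script(HEIGHT_JS)
            if new_height > last_height:
                break
            time.sleep(poll)
        if new_height == last_height:
            break  # nothing new loaded within the timeout
        last_height = new_height
    return last_height
```

With this helper, scroll_to_bottom(browser) would stand in for the loop above. The timeout still bounds the worst case, so a page that has finished loading costs one timeout period rather than one per round.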

With the full page loaded, retrieve the page source and parse it into a BeautifulSoup object using the lxml parser.

PYTHON
src = browser.page_source
soup = BeautifulSoup(src, 'lxml')

To locate any element you want to extract, right-click it on the page and select Inspect to open the browser's developer tools.

The block containing basic information is represented by a div tag with class flex-1 mr5.

PYTHON
name_div = soup.find('div', {'class': 'flex-1 mr5'})
name_div
OUTPUT
Rishabh Singh


3rd degree connection3rd



  Rishabh has a  account



            #futureshaper


              Bengaluru, Karnataka, India


                  500+ connections



                  Contact info
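
Passing a multi-class string such as 'flex-1 mr5' to find() matches only when the class attribute holds exactly that string, in that order. A CSS selector matches the same classes in any order and tolerates extra ones, which may hold up better when the markup shifts. The sketch below uses a made-up HTML fragment, not real LinkedIn markup, and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: same two classes, reordered, plus one extra class.
html = '<div class="mr5 flex-1 extra"><ul><li>Jane Doe</li></ul></div>'
soup_demo = BeautifulSoup(html, "html.parser")

exact = soup_demo.find("div", {"class": "flex-1 mr5"})  # exact string match
by_css = soup_demo.select_one("div.flex-1.mr5")         # order-independent match

print(exact)                 # None
print(by_css.li.get_text())  # Jane Doe
```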

Inside name_div there are two ul tags. The first ul holds the name, and the second holds the location and connection count.

Fetch both ul tags with name_div.find_all('ul'). Find the li in the first ul using name_loc[0].find('li') and extract its text with get_text().

PYTHON
name_loc = name_div.find_all('ul')
name = name_loc[0].find('li').get_text().strip()
name
OUTPUT
'Rishabh Singh'

For the location, find the li in the second ul.

PYTHON
loc = name_loc[1].find('li').get_text().strip()
loc
OUTPUT
'Bengaluru, Karnataka, India'

The profile title is in the h2 tag, extracted via name_div.find('h2').get_text().

PYTHON
profile_title = name_div.find('h2').get_text().strip()
profile_title
OUTPUT
'#futureshaper'

The connection count is in the second li of the second ul. Find all li tags in the second ul with name_loc[1].find_all('li'), then get the text from the second one.

PYTHON
connection = name_loc[1].find_all('li')
connection = connection[1].get_text().strip()
connection
OUTPUT
'500+ connections'
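
find() returns None when an element is missing, so chaining .get_text() onto the result raises AttributeError on profiles that lack a headline or location. A small guard keeps the script running in that case. The helper name text_or_default and the HTML fragment below are invented for illustration.

```python
from bs4 import BeautifulSoup


def text_or_default(tag, default=""):
    """Return the stripped text of a tag, or a default when the tag is None."""
    return tag.get_text().strip() if tag is not None else default


# Invented fragment with a name but no h2 headline.
demo = BeautifulSoup("<div><ul><li> Jane Doe </li></ul></div>", "html.parser")

print(text_or_default(demo.find("li")))         # Jane Doe
print(text_or_default(demo.find("h2"), "N/A"))  # N/A
```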

Create a list named info and append all collected fields to it.

PYTHON
info = []
info.append(link)
info.append(name)
info.append(profile_title)
info.append(loc)
info.append(connection)
info
OUTPUT
['https://www.linkedin.com/in/rishabh-singh-61b706114/', 'Rishabh Singh', '#futureshaper', 'Bengaluru, Karnataka, India', '500+ connections']

Experience

The experience section is accessible via a section tag with id experience-section.

PYTHON
exp_section = soup.find('section', {'id': 'experience-section'})
exp_section
OUTPUT
Experience

FPGA Engineer
Company Name

      Honeywell


Dates Employed
Aug 2019 – Present

Employment Duration
1 yr 2 mos

Location
Bengaluru Area, India

FPGA Design Engineer
Company Name

      L&T Technology Services Limited
        Full-time

Dates Employed
Jan 2017 – Jul 2019

Employment Duration
2 yrs 7 mos

Location
Bengaluru Area, India

From exp_section, get the first ul, then the first div inside it, and then the first a tag inside that div.

PYTHON
exp_section = exp_section.find('ul')
div_tag = exp_section.find('div')
a_tag = div_tag.find('a')
a_tag
OUTPUT
FPGA Engineer
Company Name

      Honeywell


Dates Employed
Aug 2019 – Present

Employment Duration
1 yr 2 mos

Location
Bengaluru Area, India

Extract the job title from the h3 tag.

PYTHON
job_title = a_tag.find('h3').get_text().strip()
job_title
OUTPUT
'FPGA Engineer'

The company name is in the second p tag, accessed via a_tag.find_all('p')[1].get_text().

PYTHON
company_name = a_tag.find_all('p')[1].get_text().strip()
company_name
OUTPUT
'Honeywell'

For the joining date, extract the first h4 tag using a_tag.find_all('h4')[0], then get the second span from it.

PYTHON
joining_date = a_tag.find_all('h4')[0].find_all('span')[1].get_text().strip()
joining_date
OUTPUT
'Aug 2019 – Present'

For the duration, extract the second h4 tag and its second span.

PYTHON
exp = a_tag.find_all('h4')[1].find_all('span')[1].get_text().strip()
exp
OUTPUT
'1 yr 2 mos'
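
Duration strings are awkward to sort or filter. As a rough sketch, assuming the text always follows the 'N yr(s) M mo(s)' pattern seen in the outputs above, a regular expression can convert it to a month count. The function name duration_to_months is hypothetical.

```python
import re


def duration_to_months(text):
    """Convert strings like '1 yr 2 mos' or '7 mos' to a month count."""
    years = re.search(r"(\d+)\s*yrs?", text)
    months = re.search(r"(\d+)\s*mos?", text)
    total = 0
    if years:
        total += int(years.group(1)) * 12
    if months:
        total += int(months.group(1))
    return total


print(duration_to_months("1 yr 2 mos"))   # 14
print(duration_to_months("2 yrs 7 mos"))  # 31
```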

Append the scraped experience data to info.

PYTHON
info
OUTPUT
['https://www.linkedin.com/in/rishabh-singh-61b706114/', 'Rishabh Singh', '#futureshaper', 'Bengaluru, Karnataka, India', '500+ connections']
PYTHON
info.append(company_name)
info.append(job_title)
info.append(joining_date)
info.append(exp)
info
OUTPUT
['https://www.linkedin.com/in/rishabh-singh-61b706114/', 'Rishabh Singh', '#futureshaper', 'Bengaluru, Karnataka, India', '500+ connections', 'Honeywell', 'FPGA Engineer', 'Aug 2019 – Present', '1 yr 2 mos']
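
The steps above read only the first job in the list. To collect every entry, the same lookups can run once per li element. The sketch below is built on a simplified, hand-written HTML sample that imitates the structure described in this section (h3 for the title, second p for the company, second span of each h4 for dates and duration). Real profiles may nest several roles under one company, which this version does not handle, and the function name parse_experience is hypothetical.

```python
from bs4 import BeautifulSoup

# Hand-written sample that imitates the layout described above.
SAMPLE = """
<section id="experience-section"><ul>
  <li><div><a>
    <h3>FPGA Engineer</h3><p>Company Name</p><p>Honeywell</p>
    <h4><span>Dates Employed</span><span>Aug 2019 – Present</span></h4>
    <h4><span>Employment Duration</span><span>1 yr 2 mos</span></h4>
  </a></div></li>
  <li><div><a>
    <h3>FPGA Design Engineer</h3><p>Company Name</p><p>L&amp;T Technology Services Limited</p>
    <h4><span>Dates Employed</span><span>Jan 2017 – Jul 2019</span></h4>
    <h4><span>Employment Duration</span><span>2 yrs 7 mos</span></h4>
  </a></div></li>
</ul></section>
"""


def parse_experience(section):
    """Return one dict per top-level <li> entry in the experience list."""
    jobs = []
    for item in section.find("ul").find_all("li", recursive=False):
        a_tag = item.find("a")
        if a_tag is None:
            continue  # skip entries that do not follow the assumed layout
        h4_tags = a_tag.find_all("h4")
        jobs.append({
            "job_title": a_tag.find("h3").get_text().strip(),
            "company_name": a_tag.find_all("p")[1].get_text().strip(),
            "dates": h4_tags[0].find_all("span")[1].get_text().strip(),
            "duration": h4_tags[1].find_all("span")[1].get_text().strip(),
        })
    return jobs


sample_soup = BeautifulSoup(SAMPLE, "html.parser")
jobs = parse_experience(sample_soup.find("section", {"id": "experience-section"}))
print(len(jobs))                 # 2
print(jobs[1]["company_name"])   # L&T Technology Services Limited
```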

Education

The education section uses a section tag with id education-section. Retrieve the ul tag inside it to access all education entries.

PYTHON
edu_section = soup.find('section', {'id': 'education-section'}).find('ul')
edu_section
OUTPUT
Technocrats Institute of Technology (Excellence), Anand Nagar, PB No. 24, Post Piplani, BHEL, Bhopal - 462021

Degree Name
Bachelor of Engineering (B.E.)

Field Of Study
Electrical, Electronics and Communications Engineering

Grade
FIRST

Dates attended or expected graduation

2012 – 2016




S.H.S.B.B

Field Of Study
PCM

The college name is directly in the h3 tag.

PYTHON
college_name = edu_section.find('h3').get_text().strip()
college_name
OUTPUT
'Technocrats Institute of Technology (Excellence), Anand Nagar, PB No. 24, Post Piplani, BHEL, Bhopal - 462021'

The degree name is in the second span of the p tag with class pv-entity__secondary-title pv-entity__degree-name t-14 t-black t-normal.

PYTHON
degree_name = edu_section.find('p', {'class': 'pv-entity__secondary-title pv-entity__degree-name t-14 t-black t-normal'}).find_all('span')[1].get_text().strip()
degree_name
OUTPUT
'Bachelor of Engineering (B.E.)'

The field of study is in the second span of the p tag with class pv-entity__secondary-title pv-entity__fos t-14 t-black t-normal.

PYTHON
stream = edu_section.find('p', {'class': 'pv-entity__secondary-title pv-entity__fos t-14 t-black t-normal'}).find_all('span')[1].get_text().strip()
stream
OUTPUT
'Electrical, Electronics and Communications Engineering'

The graduation years are in the second span of the p tag with class pv-entity__dates t-14 t-black--light t-normal.

PYTHON
degree_year = edu_section.find('p', {'class': 'pv-entity__dates t-14 t-black--light t-normal'}).find_all('span')[1].get_text().strip()
degree_year
OUTPUT
'2012 – 2016'
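
For analysis it can help to split the year range into numbers. A minimal sketch, assuming the text contains four-digit years as in the output above; the function name parse_year_range is hypothetical.

```python
import re


def parse_year_range(text):
    """Return (start, end) as ints from text like '2012 – 2016'.

    Either value is None when the corresponding year is absent.
    """
    years = [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", text)]
    if not years:
        return None, None
    start = years[0]
    end = years[1] if len(years) > 1 else None
    return start, end


print(parse_year_range("2012 – 2016"))  # (2012, 2016)
```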

Append the education fields to info.

PYTHON
info
OUTPUT
['https://www.linkedin.com/in/rishabh-singh-61b706114/', 'Rishabh Singh', '#futureshaper', 'Bengaluru, Karnataka, India', '500+ connections', 'Honeywell', 'FPGA Engineer', 'Aug 2019 – Present', '1 yr 2 mos']
PYTHON
info.append(college_name)
info.append(degree_name)
info.append(stream)
info.append(degree_year)
info
OUTPUT
['https://www.linkedin.com/in/rishabh-singh-61b706114/', 'Rishabh Singh', '#futureshaper', 'Bengaluru, Karnataka, India', '500+ connections', 'Honeywell', 'FPGA Engineer', 'Aug 2019 – Present', '1 yr 2 mos', 'Technocrats Institute of Technology (Excellence), Anand Nagar, PB No. 24, Post Piplani, BHEL, Bhopal - 462021', 'Bachelor of Engineering (B.E.)', 'Electrical, Electronics and Communications Engineering', '2012 – 2016']
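
The info list stores values by position, so writing it to disk needs column names. The sketch below pairs the thirteen values with labels and saves them as CSV or JSON using only the standard library. The column names in FIELDS and the function names are assumptions chosen to mirror the order in which fields were appended.

```python
import csv
import json

# Assumed column names, in the same order the values were appended to info.
FIELDS = ["profile_url", "name", "headline", "location", "connections",
          "company", "job_title", "dates_employed", "duration",
          "college", "degree", "field_of_study", "years_attended"]


def to_record(info):
    """Pair each positional value in info with a column name."""
    if len(info) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(info)}")
    return dict(zip(FIELDS, info))


def save_csv(records, path):
    """Write a list of record dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)


def save_json(records, path):
    """Write a list of record dicts to a JSON file, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)
```

For a single profile, save_csv([to_record(info)], "profiles.csv") would write one header row and one data row. The length check catches a list that has accidentally received the same fields twice.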

All target data points from the LinkedIn profile have been scraped. The steps can be wrapped in a function and called in a loop to scrape multiple profiles sequentially.
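
One way to structure such a loop is sketched below. It assumes a function scrape_one(link) that wraps the steps from this tutorial (open the link, scroll, parse) and returns the info list; that function, the name scrape_many, and the delay values are all hypothetical. A random pause between requests avoids hitting the site at a fixed rhythm, and a failed profile is skipped rather than stopping the run.

```python
import random
import time


def scrape_many(links, scrape_one, min_delay=3.0, max_delay=8.0, sleep=time.sleep):
    """Call scrape_one(link) for each link, pausing a random interval between calls."""
    results = []
    for index, link in enumerate(links):
        if index > 0:
            sleep(random.uniform(min_delay, max_delay))
        try:
            results.append(scrape_one(link))
        except Exception as exc:  # keep going if one profile fails
            print(f"skipped {link}: {exc}")
    return results
```

The sleep parameter exists so the pause can be replaced during testing; in normal use the default time.sleep applies.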

Warning

Web page structures, dynamic class names, and element IDs on LinkedIn change frequently. If the script fails to locate elements, inspect the target page and update the selectors accordingly, both the Selenium locators used for login and the BeautifulSoup lookups used for parsing.

Conclusion

In this tutorial, you built a LinkedIn profile scraper using Selenium and BeautifulSoup to extract professional histories, education, locations, and connection stats.

Key takeaways:

  • Using an official API is preferable when one is available, but browser automation with Selenium is effective for fetching dynamic, lazy-loaded content.
  • Scrolling to the bottom of the page ensures that delayed elements (such as older job entries and education details) are loaded into the DOM before parsing.
  • BeautifulSoup locates data by tag name, by attributes such as class and id, and by position in the element hierarchy.
  • Saving scraped information into lists simplifies downstream processing, such as database insertion or CSV conversion.

Next steps:

  • Read the LinkedIn Auto Connect Bot tutorial to learn how to automate outreach workflows using these scraped profile targets.
  • Extend this script to export the scraped profiles directly into a local .csv file or a relational database for further data analysis.
  • Implement explicit wait structures (WebDriverWait) in Selenium to replace static time.sleep() calls, improving execution speed and reliability.
