This Python script scrapes data from the Universitas Airlangga (Unair) faculty directory. It fetches faculty URLs, generates page URLs for each faculty, and extracts information from lecturer pages. The collected data is then saved to a CSV file.
Before running the script, ensure you have the necessary packages installed. You can install them using pip:

```bash
pip install requests beautifulsoup4
```
- Update the `BASE_URL` constant with the URL of the Unair faculty directory (see the configuration sketch below).
- Set the desired `TIMEOUT` value for requests.
- Run the script using Python: `python your_script_name.py`
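A minimal sketch of how these constants might look at the top of the script; the URL and timeout shown here are placeholder assumptions, not values confirmed by the project:

```python
# Hypothetical configuration values; replace with the actual Unair directory URL.
BASE_URL = "https://unair.ac.id/faculty-directory/"  # assumed URL
TIMEOUT = 10  # request timeout in seconds (assumed default)
```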
- `extract_text`: A function to extract and clean text from an element.
- `extract_faculties`: A function to extract faculty URLs from the main page.
- `extract_pages`: A function to generate page URLs for each faculty.
- `extract_dosen_pages`: A function to extract lecturer information from their respective pages (a combined sketch of these helpers follows below).
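A hedged sketch of what these helpers might look like; the CSS selectors, HTML structure, and `?page=N` pagination scheme are assumptions for illustration, not the project's confirmed markup:

```python
import requests
from bs4 import BeautifulSoup

TIMEOUT = 10  # assumed request timeout in seconds


def extract_text(element):
    """Return the stripped text of a BeautifulSoup element, or an empty string."""
    return element.get_text(strip=True) if element else ""


def extract_faculties(session, base_url):
    """Collect faculty page URLs from the main directory page.

    The 'a.faculty-link' selector is a placeholder; the real page may use
    different markup.
    """
    response = session.get(base_url, timeout=TIMEOUT)
    soup = BeautifulSoup(response.text, "html.parser")
    return [a["href"] for a in soup.select("a.faculty-link") if a.get("href")]


def extract_pages(faculty_url, num_pages):
    """Generate paginated listing URLs for a faculty (assumed '?page=N' scheme)."""
    return [f"{faculty_url}?page={n}" for n in range(1, num_pages + 1)]


def extract_dosen_pages(session, page_url):
    """Extract lecturer (dosen) name and profile URL from a listing page.

    The '.dosen-card' and '.name' selectors are hypothetical.
    """
    response = session.get(page_url, timeout=TIMEOUT)
    soup = BeautifulSoup(response.text, "html.parser")
    lecturers = []
    for card in soup.select(".dosen-card"):
        link = card.select_one("a")
        lecturers.append({
            "name": extract_text(card.select_one(".name")),
            "profile_url": link["href"] if link and link.get("href") else "",
        })
    return lecturers
```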
- The script initializes a session and sends a GET request to the specified `BASE_URL`.
- It extracts faculty URLs, generates page URLs for each faculty, and extracts information from lecturer pages.
- The collected data is saved to a CSV file named `data_dosen_unair.csv` (see the flow sketch below).
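A hedged sketch of the overall flow, reusing the helper functions sketched above; the field names and per-faculty page count are assumptions, not values taken from the project:

```python
import csv
import requests

# Assumed configuration; replace with the real directory URL.
BASE_URL = "https://unair.ac.id/faculty-directory/"
PAGES_PER_FACULTY = 5  # assumed number of listing pages per faculty


def main():
    session = requests.Session()
    rows = []
    # Walk each faculty, then each paginated listing page, collecting lecturers.
    for faculty_url in extract_faculties(session, BASE_URL):
        for page_url in extract_pages(faculty_url, PAGES_PER_FACULTY):
            rows.extend(extract_dosen_pages(session, page_url))

    # Write the collected lecturer records to CSV.
    with open("data_dosen_unair.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "profile_url"])
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    main()
```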
This project is licensed under the MIT License - see the LICENSE file for details.