A web scraping application built with Next.js and Firecrawl. It scrapes individual URLs or crawls entire websites and converts the content into clean markdown, presented in a responsive UI.
For reliable web scraping, especially when dealing with rate limits or IP blocks, we recommend using Toolip.io Premium Proxies.
- 🚀 High-performance proxy servers
- 🌍 Multiple global locations
- ⚡ Ultra-fast response times
- 🔒 Secure and anonymous
- 🎯 Perfect for web scraping projects
- 💪 Reliable uptime
- 🔄 Automatic rotation
- 🛡️ Advanced IP protection
- Bypass rate limiting
- Avoid IP blocks
- Access geo-restricted content
- Enhance scraping reliability
- Improve success rates
- Scale your scraping operations
Features:
- 🌐 Single URL scraping with instant results
- 🕷️ Full website crawling with configurable depth
- 📝 Clean markdown output
- 🎨 Modern, responsive UI with dark mode
- ⚡ Real-time status updates
- 🔄 Automatic polling for crawl status
- 💅 Beautiful glassmorphism design
- ♿ Fully accessible components
Tech stack:
- Framework: Next.js 14
- Language: TypeScript
- Styling: TailwindCSS
- Icons: Lucide Icons
- API: Firecrawl
- Animations: TailwindCSS Animate
- Deployment: Vercel (recommended)
Before you begin, ensure you have the following installed:
- Node.js 18.x or higher
- npm or yarn package manager
- Git
# Clone the repository
git clone https://github.com/yourusername/webscrapper-firecrawl.git
# Navigate to the project directory
cd webscrapper-firecrawl
# Using npm
npm install
# Using yarn
yarn install
Create a .env.local file in the root directory:
NEXT_PUBLIC_FIRECRAWL_API_KEY=your_firecrawl_api_key
Replace your_firecrawl_api_key with your actual Firecrawl API key. You can get one at Firecrawl's website.
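A missing or placeholder key is easy to overlook until the first request fails. The sketch below shows one way to check the value early. `requireApiKey` is a hypothetical helper, not something this project ships. Note also that Next.js inlines any variable prefixed with NEXT_PUBLIC_ into the browser bundle, so the key is visible to anyone who loads the page.

```typescript
// Hypothetical helper (not part of this project): fail fast on a bad key.
const PLACEHOLDER = "your_firecrawl_api_key";

function requireApiKey(value: string | undefined): string {
  const key = (value ?? "").trim();
  if (key === "" || key === PLACEHOLDER) {
    throw new Error(
      "NEXT_PUBLIC_FIRECRAWL_API_KEY is missing or still set to the placeholder; check .env.local"
    );
  }
  return key;
}

// Usage: Next.js only substitutes NEXT_PUBLIC_ variables that are referenced
// by their full literal name, so pass the value in rather than process.env itself:
// const apiKey = requireApiKey(process.env.NEXT_PUBLIC_FIRECRAWL_API_KEY);
```

Taking the value as a parameter keeps the helper free of any dependency on the runtime environment.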
# Using npm
npm run dev
# Using yarn
yarn dev
The application will be available at http://localhost:3000.
- Single URL Scraping:
  - Enter the target URL in the input field
  - Click "Scrape URL"
  - View the results in markdown and HTML format
- Website Crawling:
  - Enter the website's URL
  - Click "Crawl Website"
  - Monitor the progress in real-time
  - View all scraped pages in the results
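The real-time progress during a crawl comes from polling the crawl status. A minimal sketch of such a loop follows, under stated assumptions: `pollCrawl`, the `CrawlStatus` shape, the `"completed"` and `"failed"` status strings, and the `fetchStatus` callback are all illustrative, and the code in src/app/page.tsx may be organised differently.

```typescript
// Illustrative status shape; the actual Firecrawl response may carry other fields.
type CrawlStatus = { status: string; completed: number; total: number };

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollCrawl(
  fetchStatus: () => Promise<CrawlStatus>, // hypothetical: wraps the status request
  onProgress: (s: CrawlStatus) => void,    // e.g. update React state for the UI
  intervalMs = 2000,
  maxAttempts = 150
): Promise<CrawlStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = await fetchStatus();
    onProgress(current);
    if (current.status === "completed") return current;
    if (current.status === "failed") throw new Error("Crawl failed");
    await sleep(intervalMs); // wait before asking again
  }
  throw new Error(`Crawl did not finish after ${maxAttempts} status checks`);
}
```

Capping the number of attempts keeps a stalled crawl from polling forever.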
webscrapper-firecrawl/
├── src/
│ ├── app/
│ │ ├── layout.tsx # Root layout component
│ │ ├── page.tsx # Main application page
│ │ └── globals.css # Global styles
│ └── components/ # Reusable components
├── public/ # Static assets
├── tailwind.config.ts # TailwindCSS configuration
├── next.config.js # Next.js configuration
└── package.json # Project dependencies
You can modify the scraping and crawling options in src/app/page.tsx:
// Scraping options
const scrapeOptions = {
  formats: ['markdown', 'html']
};
// Crawling options
const crawlOptions = {
  limit: 100,
  scrapeOptions: { formats: ['markdown', 'html'] }
};
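If these options are edited often, typing them can catch mistakes such as a misspelled format or a zero limit. The sketch below is one way to do that. `ScrapeFormat`, `CrawlOptions` and `buildCrawlOptions` are hypothetical names that mirror only the two objects shown above, not the full set of options Firecrawl accepts.

```typescript
// Illustrative types covering only the fields used in this README.
type ScrapeFormat = "markdown" | "html";

interface ScrapeOptions {
  formats: ScrapeFormat[];
}

interface CrawlOptions {
  limit: number; // maximum number of pages to crawl
  scrapeOptions: ScrapeOptions;
}

function buildCrawlOptions(
  limit: number,
  formats: ScrapeFormat[] = ["markdown", "html"]
): CrawlOptions {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  if (formats.length === 0) {
    throw new RangeError("at least one format is required");
  }
  // Copy the array so later changes to the caller's list do not leak in.
  return { limit, scrapeOptions: { formats: [...formats] } };
}

// Example: a smaller, markdown-only crawl.
const smallCrawl = buildCrawlOptions(25, ["markdown"]);
```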
Variable | Description | Required |
---|---|---|
NEXT_PUBLIC_FIRECRAWL_API_KEY | Your Firecrawl API key | Yes |
The easiest way to deploy your Next.js app is to use Vercel.
- Push your code to GitHub
- Import your repository to Vercel
- Add your environment variables
- Deploy!
You can also deploy to other platforms like:
- Netlify
- AWS Amplify
- Digital Ocean
- Self-hosted
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Firecrawl for the amazing API
- Next.js for the awesome framework
- TailwindCSS for the styling utilities
- Lucide for the beautiful icons
For support, email [email protected] or open an issue in the GitHub repository.
Made with ❤️ by [Your Name]