
Browser Automation Kit

A powerful skill for browser automation and testing. Includes structured workflows, validation checks, and reusable utility patterns.

Skill · Cliptics · utilities · v1.0.0 · MIT


A comprehensive browser automation skill for building web scraping pipelines, UI testing workflows, and automated browser interactions using Playwright, Puppeteer, and Selenium.

When to Use

Choose Browser Automation when:

  • Automating repetitive browser tasks like form filling and data extraction
  • Building end-to-end UI test suites that simulate real user interactions
  • Scraping dynamic web pages that require JavaScript rendering
  • Monitoring web applications for visual regressions or uptime

Consider alternatives when:

  • Scraping static HTML pages — use HTTP requests with cheerio or Beautiful Soup
  • Testing APIs directly — use HTTP client testing tools
  • Simple UI testing — use a lighter framework like Cypress

Quick Start

```bash
# Install Playwright with browsers
npm init playwright@latest

# or
pip install playwright && playwright install
```
```typescript
import { chromium, type Page } from 'playwright';

async function automateWorkflow() {
  const browser = await chromium.launch({ headless: true });
  const context = await browser.newContext({
    viewport: { width: 1280, height: 720 },
    userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36'
  });
  const page = await context.newPage();

  // Navigate and wait for network idle
  await page.goto('https://example.com', { waitUntil: 'networkidle' });

  // Fill form fields
  await page.fill('input[name="username"]', 'testuser');
  await page.fill('input[name="password"]', 'testpass');
  await page.click('button[type="submit"]');

  // Wait for navigation and extract data
  await page.waitForURL('**/dashboard');
  const data = await page.evaluate(() => {
    const items = document.querySelectorAll('.data-item');
    return Array.from(items).map(el => ({
      title: el.querySelector('h3')?.textContent?.trim(),
      value: el.querySelector('.value')?.textContent?.trim()
    }));
  });

  console.log('Extracted:', data);
  await browser.close();
  return data;
}

// Web scraper with pagination
async function scrapeWithPagination(baseUrl: string) {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  const allItems: any[] = [];
  let currentPage = 1;
  let hasNextPage = true;

  while (hasNextPage) {
    await page.goto(`${baseUrl}?page=${currentPage}`, { waitUntil: 'domcontentloaded' });

    const items = await page.$$eval('.item', elements =>
      elements.map(el => ({
        title: el.querySelector('.title')?.textContent?.trim(),
        price: el.querySelector('.price')?.textContent?.trim(),
        link: el.querySelector('a')?.href
      }))
    );
    allItems.push(...items);

    hasNextPage = await page.$('.next-page:not(.disabled)') !== null;
    currentPage++;
    await page.waitForTimeout(1000); // Rate limiting
  }

  await browser.close();
  return allItems;
}
```

Core Concepts

Automation Framework Comparison

| Feature | Playwright | Puppeteer | Selenium | Cypress |
|---|---|---|---|---|
| Browsers | Chromium, Firefox, WebKit | Chromium | All major | Chromium, Firefox |
| Languages | JS, Python, Java, .NET | JS | Java, Python, JS, C# | JS/TS |
| Auto-wait | Built-in | Manual | Manual | Built-in |
| Network intercept | Full | Full | Limited | Full |
| Mobile emulation | Excellent | Good | Via drivers | Limited |
| Parallel | Native | Manual | Grid | Limited |
| Speed | Fast | Fast | Slower | Fast |

Page Object Model Pattern

```typescript
class LoginPage {
  constructor(private page: Page) {}

  async navigate() {
    await this.page.goto('/login');
  }

  async login(username: string, password: string) {
    await this.page.fill('#username', username);
    await this.page.fill('#password', password);
    await this.page.click('#login-button');
    await this.page.waitForURL('**/dashboard');
    return new DashboardPage(this.page);
  }

  async getErrorMessage() {
    return this.page.textContent('.error-message');
  }
}

class DashboardPage {
  constructor(private page: Page) {}

  async getUserName() {
    return this.page.textContent('.user-name');
  }

  async getStats() {
    return this.page.$$eval('.stat-card', cards =>
      cards.map(card => ({
        label: card.querySelector('.label')?.textContent,
        value: card.querySelector('.value')?.textContent
      }))
    );
  }
}
```

Configuration

| Option | Description | Default |
|---|---|---|
| `browser` | Browser engine: `chromium`, `firefox`, `webkit` | `"chromium"` |
| `headless` | Run without a visible browser window | `true` |
| `viewport_width` | Browser viewport width | `1280` |
| `viewport_height` | Browser viewport height | `720` |
| `timeout` | Default operation timeout (ms) | `30000` |
| `retry_count` | Number of retries on failure | `2` |
| `screenshot_on_failure` | Capture a screenshot when a test fails | `true` |
| `video_recording` | Record browser session video | `false` |
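
Assuming these options feed a Playwright session, they map onto launch and context settings roughly as sketched below. The `KitConfig` shape and `toPlaywrightOptions` helper are illustrative, not part of Playwright itself:

```typescript
// Illustrative only: KitConfig mirrors the option names in the table above,
// and toPlaywrightOptions maps them onto the shapes Playwright expects for
// chromium.launch(), browser.newContext(), and page.setDefaultTimeout().
interface KitConfig {
  browser?: 'chromium' | 'firefox' | 'webkit';
  headless?: boolean;
  viewport_width?: number;
  viewport_height?: number;
  timeout?: number;
}

function toPlaywrightOptions(config: KitConfig) {
  return {
    browser: config.browser ?? 'chromium',
    launch: { headless: config.headless ?? true },
    context: {
      viewport: {
        width: config.viewport_width ?? 1280,
        height: config.viewport_height ?? 720,
      },
    },
    timeout: config.timeout ?? 30000,
  };
}
```

Keeping the mapping in one place means the defaults in the table are applied consistently wherever a browser is launched.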

Best Practices

  1. Use Playwright's auto-waiting instead of manual waitForTimeout calls — Playwright automatically waits for elements to be actionable before interacting, making tests more reliable and faster than hardcoded delays
  2. Implement the Page Object Model to encapsulate page-specific selectors and interactions in classes, making tests readable and maintainable when the UI changes
  3. Use data-testid attributes for test selectors instead of CSS classes or XPath — test IDs survive UI redesigns and clearly signal that an element is used by automation
  4. Intercept network requests for testing instead of relying on real APIs — page.route() lets you mock API responses to test specific scenarios like errors, empty states, and edge cases deterministically
  5. Run tests in parallel across browsers using Playwright's built-in test runner with projects configured for Chromium, Firefox, and WebKit to catch browser-specific issues early

Common Issues

Flaky tests due to timing issues: Tests that pass locally but fail in CI often have race conditions where elements are not ready when the test interacts with them. Replace waitForTimeout with explicit waiters like waitForSelector, waitForResponse, or Playwright's built-in auto-waiting that checks element visibility and actionability.
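
For example, a hardcoded delay after clicking a load button can be replaced by waiting for the response that actually populates the UI. The `/api/items` endpoint here is an assumed example; the predicate is exactly the kind of function Playwright's page.waitForResponse accepts:

```typescript
// Pure predicate for page.waitForResponse: matches the (assumed) /api/items
// endpoint returning successfully. Playwright invokes it with a Response
// object exposing url() and status().
const isItemsResponse = (res: { url(): string; status(): number }) =>
  res.url().includes('/api/items') && res.status() === 200;

// Usage inside a test, where page is a Playwright Page:
//   await page.click('#load-items');
//   await page.waitForResponse(isItemsResponse); // the data actually arrived
//   await page.waitForSelector('.data-item');    // and the UI rendered it
```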

CAPTCHA and bot detection blocking automation: Many websites detect headless browsers and serve CAPTCHAs. For authorized testing, use stealth plugins, configure realistic browser fingerprints, and set up allowlisting with the site owner. For production scraping, consider using official APIs instead.

Memory leaks in long-running scraping sessions: Browser contexts accumulate memory over thousands of page visits. Create fresh browser contexts periodically, close pages after extraction, and implement batching that restarts the browser every N pages to keep memory usage manageable.
