YouTube’s official Data API has strict quotas that limit you to about 100 searches per day. If you’re building a research tool, monitoring trends, or analyzing video metadata at scale, web scraping offers more flexibility.
In this guide, I’ll show you how to build a Go script that scrapes YouTube search results, covering HTTP requests, HTML parsing, and how to avoid detection.
Why Scrape Instead of Using the API
The YouTube Data API charges 100 units per search query against a default daily quota of 10,000 units, which works out to roughly 100 searches per day. For any project that needs more than that, these limits become a real problem. Web scraping gives you the data without quota restrictions, though it requires more maintenance.
Setup and Building the Request
You’ll need Go 1.16 or higher and the goquery library for HTML parsing:
mkdir youtube-scraper
cd youtube-scraper
go mod init youtube-scraper
go get -u github.com/PuerkitoBio/goquery

The key to successful scraping is looking like a real browser. YouTube checks your User-Agent header and will block or return different content if you look like a bot. Here's how to build a proper request:
package main

import (
    "log"
    "net/http"
    "net/url"
)

func main() {
    query := "golang tutorial"
    encodedQuery := url.QueryEscape(query)
    searchURL := "https://www.youtube.com/results?search_query=" +
        encodedQuery + "&hl=en&gl=US"

    client := &http.Client{}
    req, err := http.NewRequest("GET", searchURL, nil)
    if err != nil {
        log.Fatalf("Error creating request: %v", err)
    }

    // Present ourselves as a desktop Chrome browser.
    req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36")
    req.Header.Set("Accept-Language", "en-US,en;q=0.9")

    resp, err := client.Do(req)
    if err != nil {
        log.Fatalf("Request failed: %v", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != 200 {
        log.Fatalf("Non-OK HTTP status: %d", resp.StatusCode)
    }
}

The &hl=en&gl=US parameters set the language to English and the region to the US for consistent results. For the query above, the final URL is https://www.youtube.com/results?search_query=golang+tutorial&hl=en&gl=US, since url.QueryEscape() encodes the space as a plus sign.
Parsing and Extracting Video Data
YouTube wraps each video result in a <ytd-video-renderer> tag. Use goquery to parse the HTML and extract data:
doc, err := goquery.NewDocumentFromReader(resp.Body)
if err != nil {
    log.Fatalf("Failed to parse HTML: %v", err)
}

doc.Find("ytd-video-renderer").Each(func(index int, item *goquery.Selection) {
    title := strings.TrimSpace(item.Find("a#video-title").Text())
    videoHref, exists := item.Find("a#video-title").Attr("href")
    if !exists {
        return
    }
    videoURL := "https://www.youtube.com" + videoHref

    channelName := strings.TrimSpace(item.Find("#channel-name a").Text())
    channelHref, _ := item.Find("#channel-name a").Attr("href")
    channelURL := "https://www.youtube.com" + channelHref

    // The metadata line holds the view count and publish date as separate items.
    meta := item.Find("#metadata-line .inline-metadata-item")
    views := strings.TrimSpace(meta.Eq(0).Text())
    published := strings.TrimSpace(meta.Eq(1).Text())

    thumbnail, _ := item.Find("img").Attr("src")
    duration := strings.TrimSpace(item.Find("ytd-thumbnail-overlay-time-status-renderer span").Text())

    fmt.Printf("Video #%d:\n", index+1)
    fmt.Printf("  Title: %s\n", title)
    fmt.Printf("  URL: %s\n", videoURL)
    fmt.Printf("  Channel: %s (%s)\n", channelName, channelURL)
    fmt.Printf("  Views: %s | Published: %s\n", views, published)
    fmt.Printf("  Thumbnail: %s\n", thumbnail)
    fmt.Printf("  Duration: %s\n\n", duration)
})

This code continues inside main() after the status check, so add "fmt", "strings", and "github.com/PuerkitoBio/goquery" to your imports. Use strings.TrimSpace() on all extracted text, since YouTube's HTML includes extra whitespace.
Handling Pagination
A single request returns about 20 videos. For more results, use YouTube’s internal API at https://www.youtube.com/youtubei/v1/search with this JSON payload:
{
  "context": {
    "client": {
      "clientName": "WEB",
      "clientVersion": "2.20250620.01.00",
      "hl": "en",
      "gl": "US"
    }
  },
  "query": "golang tutorial"
}

The response includes a continuation token for fetching the next batch. This is more efficient than HTML scraping since you get structured JSON. Note that clientVersion changes frequently, so check YouTube's network requests in your browser's developer tools to get the current value.
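If you want to call this endpoint from Go, here is a minimal sketch. It assumes the endpoint accepts an unauthenticated POST with the payload above (true at the time of writing, but not guaranteed); internalSearch is my own helper name, and since the response structure is undocumented, it decodes into a generic map and leaves locating the continuation token to you:

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
)

// internalSearch posts the payload shown above to YouTube's internal search
// endpoint. The response structure is undocumented and changes over time,
// so we decode into a generic map rather than fixed structs.
func internalSearch(query string) (map[string]interface{}, error) {
    payload := map[string]interface{}{
        "context": map[string]interface{}{
            "client": map[string]interface{}{
                "clientName":    "WEB",
                "clientVersion": "2.20250620.01.00", // check DevTools for the current value
                "hl":            "en",
                "gl":            "US",
            },
        },
        "query": query,
    }
    body, err := json.Marshal(payload)
    if err != nil {
        return nil, err
    }

    req, err := http.NewRequest("POST",
        "https://www.youtube.com/youtubei/v1/search", bytes.NewReader(body))
    if err != nil {
        return nil, err
    }
    req.Header.Set("Content-Type", "application/json")
    req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != 200 {
        return nil, fmt.Errorf("non-OK HTTP status: %d", resp.StatusCode)
    }

    var result map[string]interface{}
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return nil, err
    }
    return result, nil
}

func main() {
    result, err := internalSearch("golang tutorial")
    if err != nil {
        log.Fatal(err)
    }
    // The continuation token for the next page is nested deep inside the
    // response; inspect the decoded map to locate it for your client version.
    fmt.Printf("Top-level keys in response: %d\n", len(result))
}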
Avoiding Detection
Here’s what I’ve learned about staying under the radar:
- Rate limiting: Add 2-5 second delays between requests using time.Sleep() to mimic human behavior.
- User-Agent rotation: Maintain a list of common browser User-Agents and rotate through them instead of using the same one.
- Proxies: For large-scale scraping, rotate proxies to avoid per-IP limits. Residential proxies work better than datacenter proxies.
- Error handling: If you get a 429 status code, implement exponential backoff, waiting longer between each retry (see the sketch after this list).
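Here's a minimal sketch combining the delay, User-Agent rotation, and backoff points. The helper name fetchWithRetry and the exact timing values are my own choices, not fixed conventions:

package main

import (
    "fmt"
    "log"
    "math/rand"
    "net/http"
    "time"
)

// Example User-Agents; swap in your own, more complete list.
var userAgents = []string{
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36",
}

// fetchWithRetry sleeps 2-5 seconds before each request, rotates through
// the User-Agent list, and backs off exponentially on a 429 response.
func fetchWithRetry(client *http.Client, url string, maxRetries int) (*http.Response, error) {
    backoff := 5 * time.Second
    for attempt := 0; ; attempt++ {
        // Random 2-5 second delay to mimic human pacing.
        time.Sleep(2*time.Second + time.Duration(rand.Intn(3000))*time.Millisecond)

        req, err := http.NewRequest("GET", url, nil)
        if err != nil {
            return nil, err
        }
        req.Header.Set("User-Agent", userAgents[attempt%len(userAgents)])
        req.Header.Set("Accept-Language", "en-US,en;q=0.9")

        resp, err := client.Do(req)
        if err != nil {
            return nil, err
        }
        if resp.StatusCode != http.StatusTooManyRequests {
            return resp, nil
        }

        // Rate-limited: close this response and wait before retrying.
        resp.Body.Close()
        if attempt >= maxRetries {
            return nil, fmt.Errorf("still rate-limited after %d retries", maxRetries)
        }
        time.Sleep(backoff)
        backoff *= 2
    }
}

func main() {
    client := &http.Client{Timeout: 30 * time.Second}
    resp, err := fetchWithRetry(client, "https://www.youtube.com/results?search_query=golang+tutorial&hl=en&gl=US", 3)
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    fmt.Println("Status:", resp.Status)
}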
Complete Working Script
Here’s the full implementation:
package main

import (
    "fmt"
    "log"
    "net/http"
    "net/url"
    "strings"
    "time"

    "github.com/PuerkitoBio/goquery"
)

func scrapeYouTube(query string) error {
    encodedQuery := url.QueryEscape(query)
    searchURL := "https://www.youtube.com/results?search_query=" +
        encodedQuery + "&hl=en&gl=US"

    client := &http.Client{
        Timeout: 30 * time.Second,
    }
    req, err := http.NewRequest("GET", searchURL, nil)
    if err != nil {
        return fmt.Errorf("error creating request: %w", err)
    }

    // Browser-like headers so we aren't served bot content.
    req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36")
    req.Header.Set("Accept-Language", "en-US,en;q=0.9")

    resp, err := client.Do(req)
    if err != nil {
        return fmt.Errorf("request failed: %w", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != 200 {
        return fmt.Errorf("non-OK HTTP status: %d", resp.StatusCode)
    }

    doc, err := goquery.NewDocumentFromReader(resp.Body)
    if err != nil {
        return fmt.Errorf("failed to parse HTML: %w", err)
    }

    resultCount := 0
    doc.Find("ytd-video-renderer").Each(func(index int, item *goquery.Selection) {
        title := strings.TrimSpace(item.Find("a#video-title").Text())
        if title == "" {
            return // skip entries without a title
        }
        videoHref, exists := item.Find("a#video-title").Attr("href")
        if !exists {
            return
        }
        videoURL := "https://www.youtube.com" + videoHref

        channelName := strings.TrimSpace(item.Find("#channel-name a").Text())
        channelHref, _ := item.Find("#channel-name a").Attr("href")
        channelURL := "https://www.youtube.com" + channelHref

        meta := item.Find("#metadata-line .inline-metadata-item")
        views := strings.TrimSpace(meta.Eq(0).Text())
        published := strings.TrimSpace(meta.Eq(1).Text())

        thumbnail, _ := item.Find("img").Attr("src")
        duration := strings.TrimSpace(item.Find("ytd-thumbnail-overlay-time-status-renderer span").Text())

        resultCount++
        fmt.Printf("Result #%d:\n", resultCount)
        fmt.Printf("  Title: %s\n", title)
        fmt.Printf("  URL: %s\n", videoURL)
        fmt.Printf("  Channel: %s (%s)\n", channelName, channelURL)
        fmt.Printf("  Views: %s | Published: %s\n", views, published)
        fmt.Printf("  Thumbnail: %s\n", thumbnail)
        fmt.Printf("  Duration: %s\n\n", duration)
    })

    if resultCount == 0 {
        return fmt.Errorf("no results found")
    }
    fmt.Printf("Total results found: %d\n", resultCount)
    return nil
}

func main() {
    query := "golang tutorial"
    fmt.Printf("Searching YouTube for: %s\n\n", query)
    if err := scrapeYouTube(query); err != nil {
        log.Fatalf("Error: %v", err)
    }
}

Storing the Data
Create a struct to represent each video and save to JSON:
type Video struct {
    Title       string
    URL         string
    ChannelName string
    ChannelURL  string
    Views       string
    Published   string
    Thumbnail   string
    Duration    string
}

Declare the slice before the doc.Find() loop (this also requires "encoding/json" and "os" in your imports):

videos := []Video{}

Then, inside the Each callback from the full script, append a Video instead of printing:

videos = append(videos, Video{
    Title:       title,
    URL:         videoURL,
    ChannelName: channelName,
    ChannelURL:  channelURL,
    Views:       views,
    Published:   published,
    Thumbnail:   thumbnail,
    Duration:    duration,
})

After the loop, marshal the slice and write it to disk:

jsonData, err := json.MarshalIndent(videos, "", "  ")
if err != nil {
    log.Fatal(err)
}
if err := os.WriteFile("youtube_results.json", jsonData, 0644); err != nil {
    log.Fatal(err)
}

Using Managed Services
If you don’t want to maintain scraping infrastructure, consider services like Decodo that offer APIs for extracting YouTube data. They handle proxy rotation, selector updates, and rate limits. The trade-off is cost, but the time saved often makes it worthwhile.
I’ve also found that a hybrid approach works well: use a service like Decodo for production features and keep a lightweight scraper for development or backup.
Final Thoughts
Scraping YouTube with Go is straightforward once you understand the HTML structure and how to avoid blocks. The combination of net/http and goquery gives you powerful tools for data extraction.
Keep in mind that YouTube’s Terms of Service prohibit automated access, so use this responsibly for personal research or when the API doesn’t meet your needs. The HTML structure changes regularly, so expect to update your selectors occasionally by inspecting the page with browser developer tools.
For production applications, consider combining scraping with the official API or using a managed service. This gives you reliability without being completely dependent on one approach.
Jason Moth