How to download and parse an HTML page in Go
This example requests an HTML page (https://techoverflow.net) via the Go net/http
client and then uses goquery
and a simple CSS-style query to select the <title>...</title>
HTML tag and print its content.
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/PuerkitoBio/goquery"
)

func main() {
	// Perform request
	resp, err := http.Get("https://techoverflow.net")
	if err != nil {
		// log.Fatal prints the error message and exits with a non-zero status
		log.Fatal(err)
	}
	// Cleanup when this function ends
	defer resp.Body.Close()

	// Read & parse response data
	doc, err := goquery.NewDocumentFromReader(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Print content of <title></title>
	doc.Find("title").Each(func(i int, s *goquery.Selection) {
		fmt.Printf("Title of the page: %s\n", s.Text())
	})
}
Example output:
Title of the page: TechOverflow